spline

Data Lineage Tracking and Visualization tool for Apache Spark ™


The Spline (from Spark lineage) project helps people gain insight into the data processing performed by Apache Spark ™.


The project consists of three main parts:

  • Spark Agent that sits on drivers, capturing the data lineage from Spark jobs being executed by analyzing the execution plans

  • REST Gateway that receives the lineage data from the agent and stores it in the database

  • Web UI application that visualizes the stored data lineages
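The flow through the three parts listed above can be pictured with a small, self-contained sketch. It is a toy model only and does not use Spline's or Spark's actual classes or APIs: `PlanNode`, `capture_lineage`, `Gateway`, and `render` are hypothetical names invented for illustration, the "database" is an in-memory list, and the "UI" is a plain-text rendering.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PlanNode:
    """A toy execution-plan node (hypothetical, not a Spark class)."""
    op: str                        # e.g. "read", "filter", "join", "write"
    target: Optional[str] = None   # data source name for read/write nodes
    children: List["PlanNode"] = field(default_factory=list)


def collect_inputs(node: PlanNode) -> List[str]:
    """Agent step: walk the plan tree and collect every source that is read."""
    found = [node.target] if node.op == "read" and node.target else []
    for child in node.children:
        found.extend(collect_inputs(child))
    return found


def capture_lineage(plan: PlanNode) -> Dict[str, object]:
    """Agent step: turn a plan rooted at a write node into a lineage record."""
    if plan.op != "write" or plan.target is None:
        raise ValueError("expected a plan rooted at a write node")
    return {"output": plan.target, "inputs": sorted(set(collect_inputs(plan)))}


class Gateway:
    """Gateway step: receive lineage records and store them (in memory here)."""

    def __init__(self) -> None:
        self.store: List[Dict[str, object]] = []

    def submit(self, record: Dict[str, object]) -> None:
        self.store.append(record)


def render(gateway: Gateway) -> str:
    """UI step: present the stored lineage as plain text, one record per line."""
    return "\n".join(
        f"{', '.join(r['inputs'])} -> {r['output']}" for r in gateway.store
    )


# A made-up job: join filtered orders with customers, then write a report.
plan = PlanNode("write", "reports/daily", [
    PlanNode("join", children=[
        PlanNode("filter", children=[PlanNode("read", "raw/orders")]),
        PlanNode("read", "raw/customers"),
    ])
])

gateway = Gateway()
gateway.submit(capture_lineage(plan))
print(render(gateway))  # raw/customers, raw/orders -> reports/daily
```

The point of the sketch is the separation of concerns: lineage is derived from the plan rather than from the data itself, so capturing it does not require reading any rows, and the storage and presentation layers only ever see the resulting records.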

Spline diagram

Spline is intended for use with Spark 2.3+ but also provides limited support for Spark 2.2.

For documentation and examples please visit Spline GitHub Pages.


Copyright 2019 ABSA Group Limited

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Overview

Name With Owner: AbsaOSS/spline
Primary Language: Scala
Program language: Scala (Language Count: 6)
Platform:
License: Apache License 2.0
Release Count: 40
Last Release Name: release/0.7.9 (Posted on 2023-08-17 18:24:24)
First Release Name: release/0.2.0 (Posted on 2017-08-09 11:53:06)
Created At: 2017-05-30 08:38:00
Pushed At: 2024-05-05 17:55:12
Last Commit At:
Stargazers Count: 581
Watchers Count: 40
Fork Count: 151
Commits Count: 1.5k
Has Issues Enabled:
Issues Count: 588
Issue Open Count: 51
Pull Requests Count: 485
Pull Requests Open Count: 5
Pull Requests Close Count: 87
Has Wiki Enabled:
Is Archived:
Is Fork:
Is Locked:
Is Mirror:
Is Private: