shifu

An end-to-end machine learning and data mining framework on Hadoop

Download

Please download the latest Shifu release here.

Getting Started

After downloading Shifu, build your first model by following the Shifu tutorial. More details about Shifu can be found in our wiki pages.

What is Shifu?

Shifu is an open-source, end-to-end machine learning and data mining framework built on top of Hadoop. It is designed for data scientists and simplifies the life cycle of building machine learning models. While originally built for fraud modeling, Shifu has since been generalized to many other modeling domains.

One of Shifu's strengths is its end-to-end machine learning modeling pipeline. With configuration settings alone, a whole machine learning pipeline can be built, which makes models much easier to develop and push to production. The pipeline defined in Shifu is shown below:

Shifu Pipeline

Shifu provides a simple command-line interface for each step of the model building process.

Shifu's fast, Hadoop-based distributed training of neural networks, logistic regression, and gradient boosted trees can reduce model training time from days to hours on terabyte-scale data sets. Shifu integrates with Pig workflows on Hadoop, and Shifu-trained models can be integrated into production code with a simple Java API. Shifu leverages Pig, Akka, Encog, and other open-source projects.

Guagua, an in-memory iterative computing framework on Hadoop YARN, is developed as a sub-project of Shifu to accelerate training.
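
To make the idea of iterative, in-memory distributed training more concrete, here is a minimal single-JVM sketch of logistic regression trained by repeated gradient aggregation. It is not Shifu's or Guagua's API: the class IterativeTrainingSketch, the Worker type, and the train and predict methods are hypothetical names, and the synchronous master/worker scheme is an assumption made for illustration. Each worker keeps its data partition in memory, and a master loop sums the workers' gradients and updates the shared weights on every iteration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of iterative master/worker training; not Guagua's API. */
class IterativeTrainingSketch {

    /** A worker keeps one data partition in memory and computes a local gradient. */
    static final class Worker {
        private final double[][] features;
        private final int[] labels;

        Worker(double[][] features, int[] labels) {
            this.features = features;
            this.labels = labels;
        }

        /** Sum of logistic-loss gradients over this partition for the given weights. */
        double[] gradient(double[] weights) {
            double[] grad = new double[weights.length];
            for (int i = 0; i < features.length; i++) {
                double error = sigmoid(dot(weights, features[i])) - labels[i];
                for (int j = 0; j < weights.length; j++) {
                    grad[j] += error * features[i][j];
                }
            }
            return grad;
        }

        int size() {
            return features.length;
        }
    }

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Master loop: gather every worker's gradient, average it, update the weights. */
    static double[] train(List<Worker> workers, int dims, int iterations, double learningRate) {
        double[] weights = new double[dims];
        int total = 0;
        for (Worker w : workers) {
            total += w.size();
        }
        for (int iter = 0; iter < iterations; iter++) {
            double[] sum = new double[dims];
            // On a cluster the workers would run in parallel; here they run in turn.
            for (Worker w : workers) {
                double[] g = w.gradient(weights);
                for (int j = 0; j < dims; j++) {
                    sum[j] += g[j];
                }
            }
            for (int j = 0; j < dims; j++) {
                weights[j] -= learningRate * sum[j] / total;
            }
        }
        return weights;
    }

    /** Predicted probability of the positive class for one feature vector. */
    static double predict(double[] weights, double[] x) {
        return sigmoid(dot(weights, x));
    }

    public static void main(String[] args) {
        // Two partitions; feature 0 is a constant bias term, feature 1 carries the signal.
        Worker a = new Worker(new double[][] {{1, -2}, {1, -1}}, new int[] {0, 0});
        Worker b = new Worker(new double[][] {{1, 1}, {1, 2}}, new int[] {1, 1});
        double[] weights = train(Arrays.asList(a, b), 2, 500, 0.5);
        System.out.println(Arrays.toString(weights));
        System.out.println(predict(weights, new double[] {1, 1.5}));
    }
}
```

Because the partitions stay in memory across iterations, each pass only exchanges the weight vector and the gradients, which is the kind of saving an iterative framework aims for compared with rereading the data on every pass.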

Conference

Contributors

Google Group

Please join the Shifu group if you have questions, bug reports, or anything else to discuss.

Copyright 2012-2019, PayPal Software Foundation under the Apache License.

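Key Metrics

Overview

Name and owner: ShifuML/shifu
Primary language: Java
Languages: Shell (4 languages in total)
License: Apache License 2.0

Owner Activity

Created: 2014-04-21 22:21:09
Last pushed: 2024-05-13 05:32:43
Last commit: 2021-04-30 13:01:46
Releases: 25
Latest release: 0.12.5
First release: shifu-0.2.2 (released 2014-05-06 11:30:14)

User Engagement

Stars: 252
Watchers: 40
Forks: 108
Commits: 2.5k
Issues: 444
Open issues: 229
Pull requests: 281
Open pull requests: 9
Closed pull requests: 49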