CaffeOnSpark

Distributed deep learning on Hadoop and Spark clusters.

  • Owner: yahoo/CaffeOnSpark
  • License: Apache License 2.0

Note: we're lovingly marking this project as Archived since we're no longer supporting it. You are welcome to read the code and fork your own version of it and continue to use this code under the terms of the project license.

What's CaffeOnSpark?

CaffeOnSpark brings deep learning to Hadoop and Spark clusters. By
combining salient features from the deep learning framework Caffe and
the big-data frameworks Apache Spark and Apache Hadoop, CaffeOnSpark
enables distributed deep learning on a cluster of GPU and CPU servers.

As a distributed extension of Caffe, CaffeOnSpark supports neural
network model training, testing, and feature extraction. Caffe users
can now perform distributed learning using their existing LMDB data
files and a slightly adjusted network configuration (as illustrated).

CaffeOnSpark is a Spark package for deep learning. It is complementary
to the non-deep-learning libraries MLlib and Spark SQL.
CaffeOnSpark's Scala API provides Spark applications with an easy
mechanism to invoke deep learning (see sample) over distributed datasets.

CaffeOnSpark was developed by Yahoo for large-scale distributed deep
learning on our Hadoop clusters in Yahoo's private cloud. It has been
used by Yahoo for image search, content classification, and several
other use cases.

Why CaffeOnSpark?

CaffeOnSpark provides some important benefits (see our blog) over alternative deep learning solutions.

  • It enables model training, testing, and feature extraction directly on datasets stored in HDFS on Hadoop clusters.
  • It turns your Hadoop or Spark cluster(s) into a powerful platform for deep learning, without the need to set up a new dedicated cluster for deep learning separately.
  • Direct server-to-server communication (Ethernet or InfiniBand) speeds up learning and eliminates a scalability bottleneck.
  • Caffe users' existing datasets (e.g. LMDB) and configurations can be used for distributed learning without any conversion.
  • High-level API empowers Spark applications to easily conduct deep learning.
  • Incremental learning is supported to leverage previously trained models or snapshots.
  • Additional data formats and network interfaces can be added easily.
  • It can be easily deployed on a public cloud (e.g. AWS EC2) or a private cloud.

Using CaffeOnSpark

Please check the CaffeOnSpark wiki site for detailed documentation,
such as build instructions, the API reference, and getting started
guides for a standalone cluster and an AWS EC2 cluster.

  • Batch sizes specified in prototxt files are per device.
  • Memory layers should not be shared among GPUs, and thus "share_in_parallel: false" is required for layer configuration.
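
Because batch sizes are per device, the number of samples consumed in one training step grows with the number of devices in the cluster. The following is a minimal sketch of that arithmetic, assuming synchronous data-parallel training in which every device processes its own mini-batch. The function and parameter names are hypothetical and are not part of the CaffeOnSpark API.

```python
def effective_batch_size(per_device_batch: int,
                         devices_per_node: int,
                         num_nodes: int) -> int:
    """Total samples per training step across the whole cluster.

    Assumption: each device consumes `per_device_batch` samples per step,
    where `per_device_batch` is the batch_size written in the prototxt.
    """
    if min(per_device_batch, devices_per_node, num_nodes) < 1:
        raise ValueError("all arguments must be positive integers")
    return per_device_batch * devices_per_node * num_nodes


def per_device_batch_for(target_total: int,
                         devices_per_node: int,
                         num_nodes: int) -> int:
    """batch_size to put in the prototxt to reach a desired total batch."""
    if min(target_total, devices_per_node, num_nodes) < 1:
        raise ValueError("all arguments must be positive integers")
    total_devices = devices_per_node * num_nodes
    if target_total % total_devices != 0:
        raise ValueError("target_total must be divisible by the device count")
    return target_total // total_devices


if __name__ == "__main__":
    # 4 nodes with 2 GPUs each, batch_size of 64 in the prototxt.
    print(effective_batch_size(64, devices_per_node=2, num_nodes=4))
    # batch_size needed per device for a total batch of 512 on the same cluster.
    print(per_device_batch_for(512, devices_per_node=2, num_nodes=4))
```

When moving a single-GPU Caffe configuration to a cluster, this calculation suggests either lowering the prototxt batch size or revisiting the learning rate, since the effective batch is larger than the value written in the file.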

Building for Spark 2.X

CaffeOnSpark supports both Spark 1.x and 2.x. For Spark 2.0, our default settings are:

  • spark-2.0.0
  • hadoop-2.7.1
  • scala-2.11.7

You may want to adjust them in caffe-grid/pom.xml.

Mailing List

Please join the CaffeOnSpark user group for discussions and questions.

License

The use and distribution terms for this software are covered by the
Apache 2.0 license. See the LICENSE file for terms.

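Key metrics

Overview
  • Name and owner: yahoo/CaffeOnSpark
  • Primary language: Jupyter Notebook
  • Languages: Makefile (8 languages in total)
  • License: Apache License 2.0

Owner activity
  • Created: 2016-01-11 19:21:31
  • Last push: 2019-11-15 21:44:39
  • Last commit: 2019-11-15 13:44:38
  • Releases: 0

User engagement
  • Stars: 1.3k
  • Watchers: 145
  • Forks: 354
  • Commits: 261
  • Issues: 227
  • Open issues: 78
  • Pull requests: 62
  • Open pull requests: 1
  • Closed pull requests: 13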