BigDL

BigDL: Distributed Deep Learning Library for Apache Spark

Github星跟蹤圖


BigDL: Distributed Deep Learning on Apache Spark

What is BigDL?

BigDL is a distributed deep learning library for Apache Spark; with BigDL, users can write their deep learning applications as standard Spark programs, which can directly run on top of existing Spark or Hadoop clusters. To makes it easy to build Spark and BigDL applications, a high level Analytics Zoo is provided for end-to-end analytics + AI pipelines.

  • Rich deep learning support. Modeled after Torch, BigDL provides comprehensive support for deep learning, including numeric computing (via Tensor) and high level neural networks; in addition, users can load pre-trained Caffe or Torch models into Spark programs using BigDL.

  • Extremely high performance. To achieve high performance, BigDL uses Intel MKL / Intel MKL-DNN and multi-threaded programming in each Spark task. Consequently, it is orders of magnitude faster than out-of-box open source Caffe, Torch or TensorFlow on a single-node Xeon (i.e., comparable with mainstream GPU). With adoption of Intel DL Boost, BigDL improves inference latency and throughput significantly.

  • Efficiently scale-out. BigDL can efficiently scale out to perform data analytics at "Big Data scale", by leveraging Apache Spark (a lightning fast distributed data processing framework), as well as efficient implementations of synchronous SGD and all-reduce communications on Spark.

Why BigDL?

You may want to write your deep learning programs using BigDL if:

  • You want to analyze a large amount of data on the same Big Data (Hadoop/Spark) cluster where the data are stored (in, say, HDFS, HBase, Hive, etc.).

  • You want to add deep learning functionalities (either training or prediction) to your Big Data (Spark) programs and/or workflow.

  • You want to leverage existing Hadoop/Spark clusters to run your deep learning applications, which can be then dynamically shared with other workloads (e.g., ETL, data warehouse, feature engineering, classical machine learning, graph analytics, etc.)

How to use BigDL?

主要指標

概覽
名稱與所有者intel/ipex-llm
主編程語言Python
編程語言Scala (語言數: 8)
平台
許可證Apache License 2.0
所有者活动
創建於2016-08-29 07:59:50
推送於2025-06-11 02:08:44
最后一次提交2022-12-13 09:04:59
發布數24
最新版本名稱v2.3.0-nightly (發布於 )
第一版名稱v0.1.0 (發布於 2017-04-07 11:44:36)
用户参与
星數8k
關注者數257
派生數1.4k
提交數4.1k
已啟用問題?
問題數2931
打開的問題數1175
拉請求數8948
打開的拉請求數276
關閉的拉請求數1039
项目设置
已啟用Wiki?
已存檔?
是復刻?
已鎖定?
是鏡像?
是私有?