datalab

Interactive tools and developer experiences for Big Data on Google Cloud Platform.


Google Cloud DataLab

Google Cloud DataLab provides a productive, interactive, and
integrated tool to explore, visualize, analyze and transform data, bringing together the power of
Python, SQL, JavaScript, and the Google Cloud Platform with services such as
BigQuery and Storage.

Google Cloud Datalab Beta

DataLab builds on interactive notebooks and the foundation of Jupyter
(formerly IPython), enabling developers, data scientists, and data analysts to work with
their data, from exploration to developing and deploying data pipelines, all within notebooks.

DataLab integrates deeply with Google Cloud Platform so that users can extract insights and harness
the full value of their data. It provides a secure environment in which members of a cloud project
can access the data and resources available to that project, and share notebooks via Git.

You can see example notebooks by browsing through the
samples and documentation,
which are themselves written as notebooks.

Using DataLab and Getting Started

DataLab is packaged as a Docker container that contains Jupyter/IPython and a variety of Python
libraries, such as NumPy, pandas, scikit-learn and matplotlib, in a ready-to-use form.
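
As a quick sanity check, a notebook cell along the following lines could confirm that the libraries listed above are importable in a given environment. This is only a sketch: the helper name check_bundled_libraries is made up for illustration and is not part of DataLab, and the only assumption about the container is the list of libraries named above.

```python
from importlib.util import find_spec

# Import names for the libraries named above; scikit-learn is imported as "sklearn".
BUNDLED_LIBRARIES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
}


def check_bundled_libraries():
    """Return {library name: bool} telling whether each library can be imported."""
    return {
        name: find_spec(module) is not None
        for name, module in BUNDLED_LIBRARIES.items()
    }


if __name__ == "__main__":
    for name, available in check_bundled_libraries().items():
        print(f"{name}: {'available' if available else 'missing'}")
```

The check only looks up each module without importing it, so it is cheap to run at the top of a notebook.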

You can run the Docker container locally or on Google Compute Engine (GCE), as described in the
wiki.

Contacting Us

For support or help using DataLab, please submit questions tagged with google-cloud-datalab on Stack Overflow.

For product issues, you can either submit an issue
here on this project page or send feedback using the feedback link available
within the product.

Developing DataLab

Building and Running

The wiki describes
how to set up a local development environment, the steps to build and run DataLab,
and the developer workflow.

Contributing

Contributions are welcome! Please see our roadmap
page, and check the page on contributing
for more details.

You can also contribute without submitting code: file issues and suggestions to help improve
DataLab, build and share samples, and take part in the community.

Testing

Please take a look at the test directory for instructions on how to run tests locally.

Repository Overview

This is a quick description of the repository structure to help you understand and
discover the relevant pieces.

All source code for product functionality lives
within /sources. The individual components are:

  • /sources/lib - a set of Python libraries that implement APIs for accessing Google
    Cloud Platform services and provide the DataLab interactive experience.

    • api: Google Cloud Platform APIs (currently BigQuery and Cloud Storage).
    • datalab: the interactive notebook experience that plugs into Jupyter and IPython.

  • /sources/web - the DataLab web server. It is implemented in Node.js and
    serves the DataLab front-end experience (both content and APIs) as well as backend
    infrastructure such as notebook source control.
    Some requests are proxied to the Jupyter notebook server, which manages notebooks and
    their associated kernel sessions.

  • /sources/tools - miscellaneous supporting tools.
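
The split between requests served by the web server itself and requests proxied to the Jupyter notebook server can be pictured with a small routing sketch. It is written in Python for illustration only (the real server is Node.js), and the prefix list is an assumption: the paths are taken from the Jupyter notebook REST API, but this README does not say which requests DataLab actually proxies. The function name choose_backend is hypothetical.

```python
# Assumed prefixes for illustration; the actual DataLab routing rules are not
# documented in this README.
JUPYTER_PREFIXES = ("/api/kernels", "/api/sessions", "/api/contents")


def choose_backend(request_path: str) -> str:
    """Return "jupyter" for paths that would be proxied, otherwise "datalab"."""
    for prefix in JUPYTER_PREFIXES:
        # Match the prefix exactly or as a full leading path segment.
        if request_path == prefix or request_path.startswith(prefix + "/"):
            return "jupyter"
    return "datalab"


if __name__ == "__main__":
    for path in ("/api/kernels/1234", "/static/app.js"):
        print(path, "->", choose_backend(path))
```

Matching on whole path segments keeps a path such as /api/kernelsfoo from being proxied by accident.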

Source code builds into the /build directory, and the generated build outputs are
consumed when building the DataLab Docker container.

The build outputs are packaged in the form of a Docker container.

  • /containers/datalab - the only container for now. This is the container that is used as the
    DataLab App Engine module.
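
To get oriented in a fresh checkout, a short script such as the one below could report which of the directories described in this overview are present. It is a sketch only: the function name missing_components is hypothetical, and /build is left out of the list because it is generated by the build rather than checked in.

```python
from pathlib import Path

# Directories named in this overview, relative to the repository root.
EXPECTED_DIRS = [
    "sources/lib",
    "sources/web",
    "sources/tools",
    "containers/datalab",
]


def missing_components(repo_root):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_components(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Run it from the repository root; an empty result means the layout matches the description above.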

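Key Metrics

Overview

Name and owner: googledatalab/datalab
Primary language: TypeScript
Languages: Shell (10 languages in total)
License: Apache License 2.0

Owner Activity

Created: 2014-08-07 02:37:34
Last push: 2022-09-03 04:50:03
Last commit: 2022-09-03 04:50:03
Releases: 24
Latest release: v1.2.20210816
First release: v0.5.20151012-beta

User Engagement

Stars: 1k
Watchers: 77
Forks: 248
Commits: 2.1k
Issues: 891
Open issues: 224
Pull requests: 1153
Open pull requests: 8
Closed pull requests: 132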