Python Data Science Handbook

Python Data Science Handbook: full text in Jupyter Notebooks.

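Python Data Science Handbook

This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks.

How to Use This Book

About

The book was written and tested with Python 3.5, though other Python versions (including Python 2.7) should work in nearly all cases.

The book introduces the core libraries essential for working with data in Python: in particular IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related packages. Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project, A Whirlwind Tour of Python, a fast-paced introduction to the Python language aimed at researchers and scientists.

See Index.ipynb for an index of the notebooks that accompany the text.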

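Software

The code in the book was tested with Python 3.5, though most (but not all) of it will also work correctly with Python 2.7 and other older Python versions.

The packages I used to run the code in the book are listed in requirements.txt. (Note that some of these exact version numbers may not be available on your platform; you may have to adjust them for your own use.) To install the requirements using conda, run the following at the command line: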

$ conda install --file requirements.txt

To create a stand-alone environment named PDSH with Python 3.5 and all the required package versions, run the following:

$ conda create -n PDSH python=3.5 --file requirements.txt

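You can read more about using conda environments in the Managing Environments section of the conda documentation.

License

Code

The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.

Text

The text content of the book is released under the CC-BY-NC-ND license. Read more at Creative Commons.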


(The first version translated by vz on 2020.09.20)

Overview

Name With Owner: jakevdp/PythonDataScienceHandbook
Primary Language: Jupyter Notebook
Program language: Jupyter Notebook (Language Count: 6)
Platform: Linux, Mac, Windows
License: MIT License
Release Count: 0
Created At: 2016-08-10 14:24:36
Pushed At: 2024-03-30 18:38:03
Last Commit At: 2023-05-05 16:20:45
Stargazers Count: 41.4k
Watchers Count: 1.8k
Fork Count: 17.5k
Commits Count: 235
Has Issues Enabled
Issues Count: 190
Issue Open Count: 114
Pull Requests Count: 28
Pull Requests Open Count: 102
Pull Requests Close Count: 73
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private

Python Data Science Handbook

Binder
Colab

This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks.

How to Use this Book

About

The book was written and tested with Python 3.5, though other Python versions (including Python 2.7) should work in nearly all cases.

The book introduces the core libraries essential for working with data in Python: particularly IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related packages.
Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project,
A Whirlwind Tour of Python: it's a fast-paced introduction to the Python language aimed at researchers and scientists.
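
As a quick way to see which of the core libraries named above are available in the current environment, a minimal Python sketch follows. The helper `find_missing` and the `CORE_LIBRARIES` mapping are hypothetical names introduced for illustration, and the import names (for example `sklearn` for Scikit-Learn) are the usual ones rather than anything stated in this README.

```python
import importlib.util

# Display name -> import name. The import names are assumptions based on
# how these libraries are conventionally imported.
CORE_LIBRARIES = {
    "IPython": "IPython",
    "NumPy": "numpy",
    "Pandas": "pandas",
    "Matplotlib": "matplotlib",
    "Scikit-Learn": "sklearn",
}


def find_missing(libraries):
    """Return the display names of libraries that cannot be located for import."""
    missing = []
    for display_name, import_name in libraries.items():
        # find_spec returns None when no top-level module of that name is found.
        if importlib.util.find_spec(import_name) is None:
            missing.append(display_name)
    return missing


# Report anything that would need to be installed before running the notebooks.
print("Missing libraries:", find_missing(CORE_LIBRARIES))
```

The check only confirms that a module can be located; it does not import it or verify its version.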

See Index.ipynb for an index of the notebooks available to accompany the text.

Software

The code in the book was tested with Python 3.5, though most (but not all) of it will also work correctly with Python 2.7 and other older Python versions.
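
To see how the running interpreter compares with the tested version, something like the following sketch could be used. The function name `describe_interpreter` and the constant `TESTED_VERSION` are hypothetical; only the 3.5 figure comes from the text above.

```python
import sys

# The version the book reports being tested with.
TESTED_VERSION = (3, 5)


def describe_interpreter(version_info=sys.version_info, tested=TESTED_VERSION):
    """Compare a (major, minor, ...) version with the tested (major, minor)."""
    running = (version_info[0], version_info[1])
    if running == tested:
        return "matches the tested version"
    if running < tested:
        return "older than the tested version"
    return "newer than the tested version"


print("This interpreter", describe_interpreter())
```

A mismatch does not mean the code will fail; it only signals that some examples may need adjusting.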

The packages I used to run the code in the book are listed in requirements.txt. (Note that some of these exact version numbers may not be available on your platform; you may have to tweak them for your own use.)
To install the requirements using conda, run the following at the command line:

$ conda install --file requirements.txt
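
Because some pinned versions may not be available on every platform, it can help to see how the installed packages differ from the pins. The sketch below is not part of the book's tooling. `parse_pins` and `compare_with_installed` are hypothetical helpers, and the sketch assumes the file uses simple `name=version` or `name==version` lines and that Python 3.8 or later is available for `importlib.metadata`.

```python
import re
from importlib import metadata

# Matches simple pins such as "name=1.2.3" or "name==1.2.3" (assumed format).
PIN_PATTERN = re.compile(r"([A-Za-z0-9_.\-]+)\s*={1,2}\s*([^\s=]+)")


def parse_pins(text):
    """Parse pinned requirement lines into a {name: version} dict."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = PIN_PATTERN.fullmatch(line)
        if match:  # lines in any other format are skipped
            pins[match.group(1)] = match.group(2)
    return pins


def compare_with_installed(pins):
    """Map each package to (pinned_version, installed_version or None)."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # not installed, or installed under another name
        report[name] = (pinned, installed)
    return report
```

Passing the contents of requirements.txt to `parse_pins` and the result to `compare_with_installed` gives a per-package view of which pins may need tweaking. Package names known to conda can differ from the distribution names that `importlib.metadata` sees, so a `None` result is a hint rather than proof that a package is absent.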

To create a stand-alone environment named PDSH with Python 3.5 and all the required package versions, run the following:

$ conda create -n PDSH python=3.5 --file requirements.txt

You can read more about using conda environments in the Managing Environments section of the conda documentation.
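
Once the PDSH environment has been created and activated, a notebook or script can confirm that it is running inside it. The sketch below relies on the `CONDA_DEFAULT_ENV` environment variable, which conda sets when an environment is activated; the function names are hypothetical and introduced only for illustration.

```python
import os


def active_conda_env(environ=os.environ):
    """Return the active conda environment name, or None if none is set."""
    return environ.get("CONDA_DEFAULT_ENV")


def is_env_active(expected="PDSH", environ=os.environ):
    """Check whether the expected conda environment is the active one."""
    return active_conda_env(environ) == expected


print("Active conda environment:", active_conda_env())
print("Running inside PDSH:", is_env_active())
```

If the environment was created by path rather than by name, the variable may hold a path instead, so the comparison would need adapting.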

License

Code

The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.

Text

The text content of the book is released under the CC-BY-NC-ND license. Read more at Creative Commons.
