pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

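Pmdarima (originally pyramid-arima, an anagram of 'py' + 'arima') is a statistical library designed to fill the void in Python's time series analysis capabilities. This includes:

  • The equivalent of R's auto.arima functionality
  • A collection of statistical tests of stationarity and seasonality
  • Time series utilities, such as differencing and inverse differencing
  • Numerous endogenous and exogenous transformers and featurizers, including Box-Cox and Fourier transformations
  • Seasonal time series decompositions
  • Cross-validation utilities
  • A rich collection of built-in time series datasets for prototyping and examples
  • Scikit-learn-esque pipelines to consolidate your estimators and promote productionization

Pmdarima wraps statsmodels under the hood, but is designed with an interface that's familiar to users coming from a scikit-learn background.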
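
The differencing and inverse differencing mentioned in the list above can be illustrated without the library. The sketch below is plain Python; `difference` and `inverse_difference` are hypothetical helper names chosen for this illustration, not pmdarima functions. They show only the underlying idea: subtract the value `lag` steps back, then undo that by adding the differences back onto the first `lag` observations.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def inverse_difference(diffed, initial, lag=1):
    """Rebuild the original series from its differences.

    `initial` must hold the first `lag` values of the original series.
    """
    if len(initial) != lag:
        raise ValueError("initial must contain exactly `lag` values")
    restored = list(initial)
    for d in diffed:
        # original[t] = original[t - lag] + diffed value
        restored.append(restored[-lag] + d)
    return restored


series = [3, 5, 8, 12, 17]
diffed = difference(series)
restored = inverse_difference(diffed, initial=series[:1])
print(diffed)              # [2, 3, 4, 5]
print(restored == series)  # True
```

A seasonal difference works the same way with a larger `lag` (for example `lag=12` for monthly data), as long as the first `lag` observations are kept for the inverse step.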

Installation

Pmdarima has binary and source distributions for Windows, Mac and Linux (manylinux) on PyPI under the package name pmdarima and can be downloaded via pip:

$ pip install pmdarima

Quickstart Examples

Fitting a simple auto-ARIMA on the wineind dataset:

import pmdarima as pm
from pmdarima.model_selection import train_test_split
import numpy as np
import matplotlib.pyplot as plt

# Load/split your data
y = pm.datasets.load_wineind()
train, test = train_test_split(y, train_size=150)

# Fit your model
model = pm.auto_arima(train, seasonal=True, m=12)

# make your forecasts
forecasts = model.predict(test.shape[0])  # predict N steps into the future

# Visualize the forecasts (blue=train, green=forecasts)
x = np.arange(y.shape[0])
plt.plot(x[:150], train, c='blue')
plt.plot(x[150:], forecasts, c='green')
plt.show()

[Figure: forecast plot for the wineind example]

Fitting a more complex pipeline on the sunspots dataset, serializing it, and then loading it from disk to make predictions:

import pmdarima as pm
from pmdarima.model_selection import train_test_split
from pmdarima.pipeline import Pipeline
from pmdarima.preprocessing import BoxCoxEndogTransformer
import pickle

# Load/split your data
y = pm.datasets.load_sunspots()
train, test = train_test_split(y, train_size=2700)

# Define and fit your pipeline
pipeline = Pipeline([
    ('boxcox', BoxCoxEndogTransformer(lmbda2=1e-6)),  # lmbda2 avoids negative values
    ('arima', pm.AutoARIMA(seasonal=True, m=12,
                           suppress_warnings=True,
                           trace=True))
])

pipeline.fit(train)

# Serialize your model just like you would in scikit:
with open('model.pkl', 'wb') as pkl:
    pickle.dump(pipeline, pkl)

# Load it and make predictions seamlessly:
with open('model.pkl', 'rb') as pkl:
    mod = pickle.load(pkl)
    print(mod.predict(15))
# [25.20580375 25.05573898 24.4263037  23.56766793 22.67463049 21.82231043
# 21.04061069 20.33693017 19.70906027 19.1509862  18.6555793  18.21577243
# 17.8250318  17.47750614 17.16803394]
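
The `lmbda2` argument in the pipeline above shifts the series before the power transform is applied, which matters because the transform is only defined for strictly positive inputs. Below is a minimal sketch of the two-parameter Box-Cox transform and its inverse, assuming the standard textbook definition. `boxcox` and `inv_boxcox` are hypothetical scalar helpers written for this illustration, not the library's implementation, and `lmbda` is fixed by hand here rather than estimated from the data.

```python
import math


def boxcox(y, lmbda, lmbda2=0.0):
    """Two-parameter Box-Cox transform of a single value."""
    shifted = y + lmbda2
    if shifted <= 0:
        raise ValueError("y + lmbda2 must be strictly positive")
    if lmbda == 0:
        return math.log(shifted)
    return (shifted ** lmbda - 1.0) / lmbda


def inv_boxcox(z, lmbda, lmbda2=0.0):
    """Invert boxcox() for the same lmbda and lmbda2."""
    if lmbda == 0:
        return math.exp(z) - lmbda2
    return (lmbda * z + 1.0) ** (1.0 / lmbda) - lmbda2


# A zero observation cannot be log-transformed directly, but the shifted value can.
z = boxcox(0.0, lmbda=0.0, lmbda2=1e-6)
print(z)  # log(1e-6)

# Round trip back to the original scale.
print(abs(inv_boxcox(boxcox(5.0, 0.5, 1e-6), 0.5, 1e-6) - 5.0) < 1e-9)  # True
```

In a pipeline, the inverse step is what lets forecasts come back on the original scale of the series rather than on the transformed one.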

Availability

pmdarima is available on PyPI in pre-built Wheel files for Python 3.5+ for the following platforms:

  • Mac (64-bit)
  • Linux (64-bit manylinux)
  • Windows (32 & 64-bit)

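If a wheel doesn't exist for your platform, you can still pip install and it will build from the source distribution tarball. However, you'll need cython>=0.29 and gcc (Mac/Linux) or MinGW (Windows) in order to build the package from source.

Note that legacy versions (<1.0.0) are available under the name "pyramid-arima" and can be pip installed via: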

# Legacy warning:
$ pip install pyramid-arima
# python -c 'import pyramid;'

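However, this is not recommended.

Documentation

All of your questions and more (including examples and guides) can be answered by the pmdarima documentation. If not, feel free to file an issue.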


(The first version translated by vz on 2020.07.19)

Main metrics

Overview
Name With Owner: alkaline-ml/pmdarima
Primary Language: Python
Program language: Shell (Language Count: 6)
Platform: Linux, Mac, Windows
License: MIT License

Owner activity
Created At: 2017-03-30 14:58:30
Pushed At: 2024-11-13 21:41:20
Last Commit At: 2024-11-07 17:05:03
Release Count: 45
Last Release Name: v2.0.4 (Posted on )
First Release Name: v0.1-alpha (Posted on )

User engagement
Stargazers Count: 1.6k
Watchers Count: 35
Fork Count: 239
Commits Count: 1.1k
Has Issues Enabled
Issues Count: 335
Issue Open Count: 61
Pull Requests Count: 231
Pull Requests Open Count: 5
Pull Requests Close Count: 20

Project settings
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private

pmdarima

Pmdarima (originally pyramid-arima, for the anagram of 'py' + 'arima') is a statistical
library designed to fill the void in Python's time series analysis capabilities. This includes:

  • The equivalent of R's auto.arima functionality
  • A collection of statistical tests of stationarity and seasonality
  • Time series utilities, such as differencing and inverse differencing
  • Numerous endogenous and exogenous transformers and featurizers, including Box-Cox and Fourier transformations
  • Seasonal time series decompositions
  • Cross-validation utilities
  • A rich collection of built-in time series datasets for prototyping and examples
  • Scikit-learn-esque pipelines to consolidate your estimators and promote productionization

Pmdarima wraps statsmodels
under the hood, but is designed with an interface that's familiar to users coming
from a scikit-learn background.
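
Cross-validation for time series has to respect temporal order: each fold trains on the past and is scored on the observations that follow. Below is a minimal sketch of a rolling forecast-origin split with an expanding training window. `rolling_origin_splits` and its parameters are hypothetical names for this illustration and do not reproduce pmdarima's own cross-validation classes.

```python
def rolling_origin_splits(n_samples, initial, horizon=1, step=1):
    """Yield (train_indices, test_indices) pairs in temporal order.

    The training window starts with `initial` observations and grows by
    `step` each fold; the test window is the next `horizon` observations.
    """
    if initial < 1 or horizon < 1 or step < 1:
        raise ValueError("initial, horizon and step must be positive")
    end = initial
    while end + horizon <= n_samples:
        yield list(range(end)), list(range(end, end + horizon))
        end += step


for train_idx, test_idx in rolling_origin_splits(6, initial=3, horizon=2):
    print(train_idx, test_idx)
# [0, 1, 2] [3, 4]
# [0, 1, 2, 3] [4, 5]
```

Because every test window lies strictly after its training window, no fold leaks future values into the model being scored.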

Installation

Pmdarima has binary and source distributions for Windows, Mac and Linux (manylinux) on PyPI
under the package name pmdarima and can be downloaded via pip:

$ pip install pmdarima

Conda distributions are also available for Windows (64-bit only), Mac and Linux using Python 3.6 or 3.7:

$ conda install -c alkaline-ml pmdarima

Quickstart Examples

Fitting a simple auto-ARIMA on the wineind dataset:

import pmdarima as pm
from pmdarima.model_selection import train_test_split
import numpy as np
import matplotlib.pyplot as plt

# Load/split your data
y = pm.datasets.load_wineind()
train, test = train_test_split(y, train_size=150)

# Fit your model
model = pm.auto_arima(train, seasonal=True, m=12)

# make your forecasts
forecasts = model.predict(test.shape[0])  # predict N steps into the future

# Visualize the forecasts (blue=train, green=forecasts)
x = np.arange(y.shape[0])
plt.plot(x[:150], train, c='blue')
plt.plot(x[150:], forecasts, c='green')
plt.show()
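
The example above ends with a plot. One way to put a number on forecast quality is to compare the forecasts with the held-out test values. The sketch below defines two common error measures in plain Python; `mean_absolute_error` and `smape` are hypothetical helpers written for this illustration, not functions taken from pmdarima.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    terms = []
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        # Treat a pair of zeros as a perfect prediction instead of dividing by zero.
        terms.append(0.0 if denom == 0 else 2.0 * abs(a - p) / denom)
    return 100.0 * sum(terms) / len(terms)


print(mean_absolute_error([10, 20, 30], [12, 18, 33]))
print(smape([100, 200], [110, 180]))
```

With the quickstart variables in scope, `smape(test, forecasts)` would score the forecast against the observations held out by `train_test_split`.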

Fitting a more complex pipeline on the sunspots dataset,
serializing it, and then loading it from disk to make predictions:

import pmdarima as pm
from pmdarima.model_selection import train_test_split
from pmdarima.pipeline import Pipeline
from pmdarima.preprocessing import BoxCoxEndogTransformer
import pickle

# Load/split your data
y = pm.datasets.load_sunspots()
train, test = train_test_split(y, train_size=2700)

# Define and fit your pipeline
pipeline = Pipeline([
    ('boxcox', BoxCoxEndogTransformer(lmbda2=1e-6)),  # lmbda2 avoids negative values
    ('arima', pm.AutoARIMA(seasonal=True, m=12,
                           suppress_warnings=True,
                           trace=True))
])

pipeline.fit(train)

# Serialize your model just like you would in scikit:
with open('model.pkl', 'wb') as pkl:
    pickle.dump(pipeline, pkl)
    
# Load it and make predictions seamlessly:
with open('model.pkl', 'rb') as pkl:
    mod = pickle.load(pkl)
    print(mod.predict(15))
# [25.20580375 25.05573898 24.4263037  23.56766793 22.67463049 21.82231043
# 21.04061069 20.33693017 19.70906027 19.1509862  18.6555793  18.21577243
# 17.8250318  17.47750614 17.16803394]

Availability

pmdarima is available on PyPI in pre-built Wheel files for Python 3.5+ for the following platforms:

  • Mac (64-bit)
  • Linux (64-bit manylinux)
  • Windows (32 & 64-bit)

It is also available on conda for Python 3.6 and 3.7 for the following platforms:

  • Mac (64-bit)
  • Linux (64-bit)
  • Windows (64-bit)

If a wheel doesn't exist for your platform, you can still pip install and it
will build from the source distribution tarball. However, you'll need cython>=0.29
and gcc (Mac/Linux) or MinGW (Windows) in order to build the package from source.

Note that legacy versions (<1.0.0) are available under the name
"pyramid-arima" and can be pip installed via:

# Legacy warning:
$ pip install pyramid-arima
# python -c 'import pyramid;'

However, this is not recommended.

Documentation

All of your questions and more (including examples and guides) can be answered by
the pmdarima documentation. If not, always
feel free to file an issue.