InterpretML

Fit interpretable machine learning models. Explain blackbox machine learning.

InterpretML - Alpha Release

In the beginning machines learned in darkness, and data scientists struggled in the void to explain them.

Let there be light.

InterpretML is an open-source Python package for training interpretable machine learning models and explaining blackbox systems. Interpretability is essential for:

  • Model debugging - Why did my model make this mistake?
  • Detecting bias - Does my model discriminate?
  • Human-AI cooperation - How can I understand and trust the model's decisions?
  • Regulatory compliance - Does my model satisfy legal requirements?
  • High-risk applications - Healthcare, finance, judicial, ...

Historically, the most interpretable machine learning models were not very accurate, and the most accurate models were not very interpretable. Microsoft Research has developed an algorithm called the Explainable Boosting Machine (EBM)* which has both high accuracy and interpretability. EBM uses modern machine learning techniques like bagging and boosting to breathe new life into traditional GAMs (Generalized Additive Models). On the benchmarks below, this makes them comparable in accuracy to random forests and gradient boosted trees, while also enhancing their intelligibility and editability.

Notebook for reproducing table

| Dataset/AUROC | Domain   | Logistic Regression | Random Forest | XGBoost   | Explainable Boosting Machine |
|---------------|----------|:-------------------:|:-------------:|:---------:|:----------------------------:|
| Adult Income  | Finance  | .907±.003           | .903±.002     | .922±.002 | .928±.002                    |
| Heart Disease | Medical  | .895±.030           | .890±.008     | .870±.014 | .916±.010                    |
| Breast Cancer | Medical  | .995±.005           | .992±.009     | .995±.006 | .995±.006                    |
| Telecom Churn | Business | .804±.015           | .824±.002     | .850±.006 | .851±.005                    |
| Credit Fraud  | Security | .979±.002           | .950±.007     | .981±.003 | .975±.005                    |

In addition to EBM, InterpretML also supports methods like LIME, SHAP, linear models, partial dependence, decision trees and rule lists. The package makes it easy to compare and contrast models to find the best one for your needs.

* EBM is a fast implementation of GA2M. Details on the algorithm can be found here.
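To make "boosting applied to a GAM" a little more concrete, here is a toy sketch under stated assumptions. It fits an additive model by visiting one feature at a time in round-robin order and nudging that feature's step function toward the current residuals with a small learning rate. This only illustrates the general idea: it is not InterpretML's implementation, it leaves out bagging and the pairwise interactions of GA2M, it uses equal-width bins in place of learned splits, and every name in it (`fit_toy_gam`, `bin_index`, `predict`, `n_bins`, `learning_rate`) is hypothetical.

```python
import random


def bin_index(value, low, high, n_bins):
    """Map a value to one of n_bins equal-width bins over [low, high]."""
    if high == low:
        return 0
    position = int((value - low) / (high - low) * n_bins)
    return min(max(position, 0), n_bins - 1)


def fit_toy_gam(X, y, n_bins=8, rounds=200, learning_rate=0.1):
    """Fit intercept + sum of per-feature step functions by cyclic boosting."""
    n_features = len(X[0])
    ranges = [(min(row[j] for row in X), max(row[j] for row in X))
              for j in range(n_features)]
    binned = [[bin_index(row[j], ranges[j][0], ranges[j][1], n_bins)
               for j in range(n_features)] for row in X]
    intercept = sum(y) / len(y)
    shapes = [[0.0] * n_bins for _ in range(n_features)]
    predictions = [intercept] * len(y)

    for _ in range(rounds):
        for j in range(n_features):  # visit one feature at a time
            sums = [0.0] * n_bins
            counts = [0] * n_bins
            for i, row_bins in enumerate(binned):
                sums[row_bins[j]] += y[i] - predictions[i]  # residual
                counts[row_bins[j]] += 1
            # Take a small step toward the mean residual of each bin.
            steps = [learning_rate * s / c if c else 0.0
                     for s, c in zip(sums, counts)]
            for b in range(n_bins):
                shapes[j][b] += steps[b]
            for i, row_bins in enumerate(binned):
                predictions[i] += steps[row_bins[j]]
    return intercept, shapes, ranges


def predict(model, row):
    """Prediction is the intercept plus one looked-up score per feature."""
    intercept, shapes, ranges = model
    total = intercept
    for j, value in enumerate(row):
        low, high = ranges[j]
        total += shapes[j][bin_index(value, low, high, len(shapes[j]))]
    return total


# Synthetic, purely additive target (made up for this illustration).
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(300)]
y = [x0 ** 2 + 2.0 * x1 for x0, x1 in X]

model = fit_toy_gam(X, y)
mse = sum((predict(model, r) - t) ** 2 for r, t in zip(X, y)) / len(y)
baseline = sum((model[0] - t) ** 2 for t in y) / len(y)  # intercept-only error
print("training MSE:", mse, "intercept-only MSE:", baseline)
```

Each entry of `shapes` is one feature's learned curve, which is what makes a model of this form readable and editable: a curve can be plotted or adjusted on its own without touching the others.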


Installation

Python 3.5+, Linux, Mac OS X, Windows

pip install -U interpret

Getting Started

Let's fit an Explainable Boosting Machine

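from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from interpret.glassbox import ExplainableBoostingClassifier

# Stand-in data (an assumption for illustration); substitute your own X and y.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

# EBM supports pandas dataframes, numpy arrays, and handles "string" data natively.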

Understand the model

from interpret import show

ebm_global = ebm.explain_global()
show(ebm_global)

Global Explanation Image

The graphs are the entire model.* For regression, sum the scores from each graph to get your prediction. For classification, sum the scores and take the softmax. Nothing is hidden. You can inspect everything.
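The "sum the scores" rule can be sketched in a few lines. This is a toy illustration, not the library's internals: it assumes each graph is a lookup from a feature bin to a score and that there is a baseline intercept. The feature names, bin edges, scores, and the helpers `lookup`, `additive_score` and `softmax` are all hypothetical.

```python
import math

# Hypothetical per-feature "graphs": each maps a bin of the feature to a score.
# Feature names, bin edges and scores are invented for illustration.
shape_functions = {
    "age": {"edges": [30.0, 50.0], "scores": [-0.4, 0.1, 0.6]},
    "hours_per_week": {"edges": [40.0], "scores": [-0.2, 0.3]},
}
intercept = -1.0  # assumed baseline score


def lookup(graph, value):
    """Return the score of the bin that `value` falls into."""
    index = sum(value >= edge for edge in graph["edges"])
    return graph["scores"][index]


def additive_score(row):
    """Sum the intercept and every per-feature score; also return the parts."""
    contributions = {name: lookup(graph, row[name])
                     for name, graph in shape_functions.items()}
    return intercept + sum(contributions.values()), contributions


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


row = {"age": 55.0, "hours_per_week": 45.0}
score, contributions = additive_score(row)

# Regression: the summed score is the prediction itself.
# Binary classification: a softmax over [0, score] is the same as applying
# the logistic function to the score.
probabilities = softmax([0.0, score])
print(contributions, score, probabilities)
```

The `contributions` dictionary holds the per-feature scores for one row, so the same lookup that produces the prediction also shows how much each feature added to or subtracted from it.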

Understand individual predictions

ebm_local = ebm.explain_local(X_test, y_test)
show(ebm_local)

Local Explanation Image

And if you have multiple models, compare them

show([logistic_regression, decision_tree])

Dashboard Image

Example Notebooks

Roadmap

Currently we're working on:

  • R language interface (work in progress: basic EBM classification is available via the ebm_classify and ebm_predict_proba functions, but predictions are somewhat less accurate than in Python. Plotting is not included yet, although other R plotting tools can produce basic visualizations of EBM models)
  • Missing Values Support
  • Improved Categorical Encoding
  • Interaction effect purification (see citations for details)

...and lots more! Get in touch to find out more.

Contributing

If you are interested in contributing directly to the code base, please see CONTRIBUTING.md.

Acknowledgements

InterpretML was originally created by (equal contributions): Samuel Jenkins, Harsha Nori, Paul Koch, and Rich Caruana

Many people have supported us along the way. Check out ACKNOWLEDGEMENTS.md!

We also build on top of many great packages. Please check them out!

plotly, dash, scikit-learn, lime, shap, salib, skope-rules, treeinterpreter, gevent, joblib, pytest, jupyter

Citations


Paper link

Contact us

There are multiple ways to get in touch:

If a tree fell in your random forest, would anyone notice?

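Key metrics

Overview
Name and owner: interpretml/interpret
Primary language: C++
Languages: Batchfile (number of languages: 15)
License: MIT License

Owner activity
Created: 2019-05-03 05:47:52
Last pushed: 2025-04-25 06:30:48
Releases: 51
Latest release: v0.6.10
First release: v0.1.0

User engagement
Stars: 6.5k
Watchers: 144
Forks: 746
Commits: 3.7k
Issues: 467
Open issues: 103
Pull requests: 103
Open pull requests: 3
Closed pull requests: 28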