awesome-machine-learning-interpretability 
A curated, but probably biased and incomplete, list of awesome machine learning interpretability resources.
If you want to contribute to this list (and please do!) read over the contribution guidelines, send a pull request, or contact me @jpatrickhall.
An incomplete, imperfect blueprint for more human-centered, lower-risk machine learning. The resources in this repository can be used to do many of these things today, but they should not be considered legal compliance advice.
Image credit: H2O.ai Machine Learning Interpretability team, https://github.com/h2oai/mli-resources.
Table of Contents
- Comprehensive Software Examples and Tutorials
- Explainability- or Fairness-Enhancing Software Packages
- Free Books
- Other Interpretability and Fairness Resources and Lists
- Review and General Papers
- Teaching Resources
- Interpretable ("Whitebox") or Fair Modeling Packages
Comprehensive Software Examples and Tutorials
- Getting a Window into your Black Box Model
- IML
- Interpretable Machine Learning with Python
- Interpreting Machine Learning Models with the iml Package
- Interpretable Machine Learning using Counterfactuals
- Machine Learning Explainability by Kaggle Learn
- Model Interpretability with DALEX
- Model Interpretation series by Dipanjan (DJ) Sarkar
- Partial Dependence Plots in R
- Saliency Maps for Deep Learning
- Visualizing ML Models with LIME
- Visualizing and debugging deep convolutional networks
- What does a CNN see?
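Several of the tutorials above center on partial dependence, which simply averages a model's output over the data while forcing one feature to each value on a grid. As a rough illustration only, with a hypothetical toy model standing in for a trained one, the computation can be sketched in pure Python:

```python
def toy_model(row):
    # Hypothetical stand-in for a trained model:
    # score rises with income and falls with debt.
    return 2 * row["income"] - row["debt"]

def partial_dependence(model, data, feature, grid):
    """Average model output over `data` with `feature` forced to each grid value."""
    pd_values = []
    for value in grid:
        total = 0.0
        for row in data:
            modified = dict(row)        # copy so the original row is untouched
            modified[feature] = value   # force the feature of interest
            total += model(modified)
        pd_values.append(total / len(data))
    return pd_values

data = [{"income": 50, "debt": 10}, {"income": 80, "debt": 30}]
print(partial_dependence(toy_model, data, "income", [0, 50, 100]))
# -> [-20.0, 80.0, 180.0]
```

Packages such as PDPbox (Python) and pdp (R) do this at scale and handle the plotting; this sketch only shows the averaging that underlies the plots.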
Explainability- or Fairness-Enhancing Software Packages
Browser
Python
- acd
- aequitas
- AI Fairness 360
- AI Explainability 360
- allennlp
- algofairness
- Alibi
- anchor
- BlackBoxAuditing
- casme
- captum
- ContrastiveExplanation (Foil Trees)
- DALEXtra
- DeepExplain
- deeplift
- deepvis
- eli5
- fairml
- fairness-comparison
- fairness_measures_code
- foolbox
- Grad-CAM (GitHub topic)
- iNNvestigate neural nets
- Integrated-Gradients
- interpret_with_rules
- Keras-vis
- keract
- lofo-importance
- L2X
- lime
- lrp_toolbox
- microsoft/interpret
- mlxtend
- PDPbox
- pyBreakDown
- PyCEbox
- rationale
- robustness
- RISE
- SALib
- shap
- Skater
- tensorflow/cleverhans
- tensorflow/lucid
- tensorflow/model-analysis
- tensorflow/privacy
- tensorflow/tcav
- tensorfuzz
- TensorWatch
- tf-explain
- Themis
- themis-ml
- treeinterpreter
- woe
- xai
- yellowbrick
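Many of the packages above (e.g., eli5, lofo-importance) offer some form of permutation importance: shuffle one feature's column and measure how much the model's score drops. A minimal pure-Python sketch, using a hypothetical toy model and synthetic data rather than any package's actual API:

```python
import random

# Hypothetical toy data and model: the label depends on feature 0 only,
# so permuting feature 0 should hurt accuracy while permuting feature 1 should not.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

def model(row):
    return 1 if row[0] > 0.5 else 0  # stand-in for a trained classifier

def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature, rng):
    """Accuracy drop after shuffling one feature column; larger drop = more important."""
    baseline = accuracy(model, X, y)
    column = [row[feature] for row in X]
    rng.shuffle(column)
    X_perm = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, column)]
    return baseline - accuracy(model, X_perm, y)

rng = random.Random(42)
print(permutation_importance(model, X, y, 0, rng))  # large drop
print(permutation_importance(model, X, y, 1, rng))  # zero drop
```

The listed packages add refinements this sketch omits: repeated shuffles, grouped features, and scoring metrics other than accuracy.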
R
- ALEPlot
- breakDown
- DrWhy.AI
- DALEX
- DALEXtra
- EloML
- ExplainPrediction
- featureImportance
- forestmodel
- fscaret
- ICEbox
- iml
- lightgbmExplainer
- lime
- live
- mcr
- modelDown
- modelOriented
- modelStudio
- pdp
- shapleyR
- shapper
- smbinning
- vip
- xgboostExplainer
Free Books
- An Introduction to Machine Learning Interpretability
- Fairness and Machine Learning
- Interpretable Machine Learning
Other Interpretability and Fairness Resources and Lists
- 8 Principles of Responsible ML
- ACM FAT* 2019 YouTube Playlist
- AI Ethics Guidelines Global Inventory
- AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models
- Awesome interpretable machine learning ;)
- Awesome machine learning operations
- algoaware
- Beyond Explainability: A Practical Guide to Managing Risk in Machine Learning Models
- criticalML
- Debugging Machine Learning Models (ICLR workshop proceedings)
- Deep Insights into Explainability and Interpretability of Machine Learning Algorithms and Applications to Risk Management
- Distill
- Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) Scholarship
- General principles for the use of Artificial Intelligence in the financial sector
- Opinion of the German Data Ethics Commission
- Machine Learning Ethics References
- Machine Learning Interpretability Resources
- MIT AI Ethics Reading Group
- private-ai-resources
- Real-World Model Debugging Strategies
- Singapore Personal Data Protection Commission (PDPC) Model Artificial Intelligence Governance Framework
- Testing and Debugging in Machine Learning
- Troubleshooting Deep Neural Networks
- Trump Administration Draft Guidance for Regulation of Artificial Intelligence Applications
- U.K. Information Commissioner's Office (ICO) AI Auditing Framework (overview series)
- AI Principles: Recommendations on the Ethical Use of Artificial Intelligence by the Department of Defense
- Warning Signs: The Future of Privacy and Security in an Age of Machine Learning
- XAI Resources
- You Created A Machine Learning Application Now Make Sure It's Secure
Review and General Papers
- 50 Years of Test (Un)fairness: Lessons for Machine Learning
- A Comparative Study of Fairness-Enhancing Interventions in Machine Learning
- A Survey Of Methods For Explaining Black Box Models
- A Marauder’s Map of Security and Privacy in Machine Learning
- Challenges for Transparency
- Explaining Explanations: An Overview of Interpretability of Machine Learning
- Explanation in Human-AI Systems: A Literature Meta-Review, Synopsis of Key Ideas and Publications, and Bibliography for Explainable AI
- Interpretable Machine Learning: Definitions, Methods, and Applications
- Limitations of Interpretable Machine Learning
- Machine Learning Explainability in Finance
- On the Art and Science of Machine Learning Explanations
- On the Responsibility of Technologists: A Prologue and Primer
- Please Stop Explaining Black Box Models for High-Stakes Decisions
- The Mythos of Model Interpretability
- Towards A Rigorous Science of Interpretable Machine Learning
- The Security of Machine Learning
- Techniques for Interpretable Machine Learning
- Trends and Trajectories for Explainable, Accountable and Intelligible Systems: An HCI Research Agenda
Teaching Resources
- An Introduction to Data Ethics
- Fairness in Machine Learning
- Human-Centered Machine Learning
- Practical Model Interpretability
- Trustworthy Deep Learning
Interpretable ("Whitebox") or Fair Modeling Packages
C/C++
Python
- Bayesian Case Model
- Bayesian Ors-Of-Ands
- Bayesian Rule List (BRL)
- Explainable Boosting Machine (EBM)/GA2M
- fair-classification
- Falling Rule List (FRL)
- H2O-3
- Optimal Sparse Decision Trees
- Monotonic XGBoost
- pyGAM
- pySS3
- Risk-SLIM
- Scikit-learn
- sklearn-expertsys
- skope-rules
- Super-sparse Linear Integer models (SLIMs)
- tensorflow/lattice
- This Looks Like That
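Several of the packages above (e.g., Bayesian Rule List, Falling Rule List, sklearn-expertsys, skope-rules) learn rule-list models, which are interpretable because a prediction is just the first rule that fires. A minimal hand-written rule list, with hypothetical rules and thresholds (the packages learn these from data rather than hard-coding them):

```python
# A tiny rule list in the spirit of the rule-list packages above.
# Rules are evaluated top to bottom; the first matching condition wins.
RULES = [
    (lambda row: row["debt_ratio"] > 0.6, "deny"),
    (lambda row: row["late_payments"] >= 3, "deny"),
    (lambda row: row["income"] > 40_000, "approve"),
]
DEFAULT = "review"  # fall-through when no rule fires

def predict(row):
    for condition, outcome in RULES:
        if condition(row):
            return outcome
    return DEFAULT

print(predict({"debt_ratio": 0.7, "late_payments": 0, "income": 90_000}))  # deny
print(predict({"debt_ratio": 0.2, "late_payments": 1, "income": 50_000}))  # approve
print(predict({"debt_ratio": 0.2, "late_payments": 1, "income": 30_000}))  # review
```

Because the model is the rule list itself, the explanation of any prediction is simply the rule that fired, which is the core appeal of "whitebox" approaches.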