Roadmap
The following list of milestones guides the core developers on the future direction of the package development. The list is by no means exhaustive and will be updated over time as the development progresses and new algorithms are proposed by the research community.
The list is algorithm- and feature-oriented, as the goal of the package is to give the community access to a tool that has all the functionality necessary for FAT (fairness, accountability and transparency) research and deployment.
Milestone 1 ✔
The first milestone is our first public release of the package – version 0.0.1. The following functionality should be available.
[Overview table: fairness, accountability and transparency functionality covered for data/features, models and predictions.]
Milestone 2
This will be the first major update of the package. The focus will be placed on the transparency module. Nevertheless, some additional fairness and accountability functionality will be implemented as well.
[Overview table: fairness, accountability and transparency functionality covered for data/features, models and predictions.]
Extra fairness metrics.
Implement additional group-based fairness metrics.
Implement threshold computation based on equalising a selected group fairness metric.
Implement Jupyter Notebook interactive plugins (widgets) to allow the community to experiment with the fairness concepts (e.g., widgets similar to the interactive figures in this Google blog post).
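As a reference point for the group-based metrics and threshold-equalisation items above, here is a minimal sketch with synthetic scores; `selection_rate`, `demographic_parity_difference` and `equalise_selection_threshold` are hypothetical names, not part of the package API:

```python
import numpy as np

def selection_rate(y_pred, group_mask):
    """Fraction of positive decisions within one protected group."""
    return y_pred[group_mask].mean()

def demographic_parity_difference(y_pred, groups):
    """Largest gap in selection rates across the protected groups."""
    rates = [selection_rate(y_pred, groups == g) for g in np.unique(groups)]
    return max(rates) - min(rates)

def equalise_selection_threshold(scores, group_mask, target_rate):
    """Per-group threshold whose selection rate best matches the target."""
    candidates = np.unique(scores[group_mask])
    gaps = [abs((scores[group_mask] >= t).mean() - target_rate)
            for t in candidates]
    return candidates[int(np.argmin(gaps))]

scores = np.array([0.9, 0.8, 0.3, 0.2, 0.7, 0.6, 0.5, 0.1])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = (scores >= 0.5).astype(int)

gap = demographic_parity_difference(y_pred, groups)               # 0.25
threshold = equalise_selection_threshold(scores, groups == 0, 0.75)
```

Other group metrics (equal opportunity, equalised odds) follow the same pattern with group-wise error rates in place of selection rates.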
Merge the pull request with k-anonymity, l-diversity and t-closeness.
Implement Background Check.
Partial Dependence (PD) and Individual Conditional Expectation (ICE) enhancements (pull request).
2D implementation.
Implementation for classification and regression.
Improved visualisations.
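The PD/ICE enhancements build on a simple mechanism: intervene on one feature over a grid of values and record the model's predictions, one curve per instance (ICE), then average the curves (PD). A sketch on synthetic data; `ice_curves` is an illustrative helper name, not the package API:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(0)
X = rng.uniform(0, 1, size=(200, 3))
y = 2 * X[:, 0] + rng.normal(scale=0.1, size=200)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

def ice_curves(model, X, feature, grid):
    """One curve per instance: prediction as `feature` sweeps the grid."""
    curves = np.empty((X.shape[0], grid.size))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value          # intervene on one feature
        curves[:, j] = model.predict(X_mod)
    return curves

grid = np.linspace(0, 1, 20)
ice = ice_curves(model, X, feature=0, grid=grid)
pd_curve = ice.mean(axis=0)                # PD is the average of ICE curves
```

The 2-D variant sweeps a grid over two features at once; classification uses predicted probabilities in place of regression outputs.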
Scikit-learn model explainers (cf. the reference implementation in the eli5 package).
Decision trees.
Feature importance.
Decision tree structure (tree plot).
Rule lists and sets (these can share a common representation with the trees).
Rule list structure (rule list in a text form).
Linear models.
Feature importance (coefficients).
K-means.
Prototypes.
Similarities between examples in a cluster that are correctly assigned to this cluster.
Criticisms.
Similarities between examples in a cluster that are incorrectly assigned to this cluster.
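Some of the listed scikit-learn explainers can already be prototyped with what the library itself exposes; a minimal sketch of the decision-tree items (impurity-based feature importance and a text rendering of the tree structure):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Impurity-based feature importance (normalised to sum to one).
importances = tree.feature_importances_

# Text rendering of the tree structure, one indented line per node.
structure = export_text(tree)
```

The rule-list and linear-model explainers follow the same read-the-fitted-attributes pattern (e.g. `coef_` for linear models).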
Implement ANCHOR.
Implement forestspy.
Implement Tree Interpreter.
“The global feature importance of random forests can be quantified by the total decrease in node impurity averaged over all trees of the ensemble (‘mean decrease impurity’).”
“We can use the difference between the mean value of data points in a parent node and that of a child node to approximate the contribution of this split…”
The Interpreting random forests and Random forest interpretation with scikit-learn blog posts hold some useful information extracted from the “Interpreting random forests” paper by Ando Saabas.
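The mechanism quoted above can be sketched for a single scikit-learn regression tree: walking the root-to-leaf path and attributing each change in node mean to the splitting feature yields prediction = bias + sum of per-feature contributions. `decompose` is an illustrative helper, not the Tree Interpreter package itself:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(size=(300, 2))
y = 3 * X[:, 0] + X[:, 1]
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)

def decompose(tree, x):
    """prediction = bias (root mean) + per-feature contributions."""
    t = tree.tree_
    node_value = t.value.ravel()           # mean target at every node
    node = 0
    bias = node_value[0]
    contributions = np.zeros(x.shape[0])
    while t.children_left[node] != -1:     # walk down to a leaf
        feature = t.feature[node]
        child = (t.children_left[node] if x[feature] <= t.threshold[node]
                 else t.children_right[node])
        # The change in node mean is attributed to the splitting feature.
        contributions[feature] += node_value[child] - node_value[node]
        node = child
    return bias, contributions, node_value[node]

x = X[0]
bias, contrib, pred = decompose(tree, x)
```

For a forest, the same decomposition is averaged over all trees in the ensemble.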
Implement a variety of feature importance metrics.
Random forest feature (variable) importance (“Random Forests”, Leo Breiman, 2001). (Similar to permutation importance.)
XGBoost feature importance.
Feature weight – the number of times a feature appears in a tree (ensemble).
Gain – the average gain of splits that use the feature.
Coverage – the average coverage (number of samples affected) of splits that use the feature.
Prediction variance – mean absolute value of changes in predictions given perturbations in the data.
Wei, Pengfei, Zhenzhou Lu, and Jingwen Song. “Variable Importance Analysis: A Comprehensive Review”. Reliability Engineering & System Safety 142 (2015): 399–432.
Scikit-learn and eli5 permutation importance (a.k.a. Mean Decrease Accuracy, MDA).
(These may be sensitive to correlated features – a user guide note should suffice.)
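Permutation importance (MDA) already ships with scikit-learn's `sklearn.inspection` module, which could serve as a baseline for the implementations above; a minimal usage sketch:

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_wine(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Mean Decrease Accuracy: shuffle one column at a time and record the
# drop in score; repeated shuffles give a mean and standard deviation.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
importance = result.importances_mean
```

Using a held-out set instead of the training data gives a less optimistic ranking; the correlated-features caveat from the list above applies here too.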
Implement model reliance (Fisher, 2018). (“All Models are Wrong but Many are Useful: Variable Importance for Black-Box, Proprietary, or Misspecified Prediction Models, using Model Class Reliance”, Aaron Fisher, Cynthia Rudin, Francesca Dominici.)
Implement TREPAN (tree surrogate).
“Extracting Comprehensible Models From Trained Neural Networks”, Mark W. Craven (1996). (PhD thesis)
“Extracting Tree-Structured Representations of Trained Networks”, Mark W. Craven and Jude W. Shavlik (NIPS, 1996). (NIPS paper)
“Study of Various Decision Tree Pruning Methods with their Empirical Comparison in WEKA”, Nikita Patel and Saurabh Upadhyay (2012). (report)
TREPAN implementation in Skater.
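The core of TREPAN – distilling a black box into a decision tree – can be sketched as a plain global surrogate. The real algorithm additionally draws new samples and uses m-of-n splits with fidelity-driven node expansion, which this sketch omits:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Surrogate: train an interpretable tree on the black box's OWN labels,
# not on the ground truth.
y_bb = black_box.predict(X)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_bb)

# Fidelity: how often the surrogate agrees with the black box.
fidelity = (surrogate.predict(X) == y_bb).mean()
```

Fidelity, rather than accuracy against the ground truth, is the quantity TREPAN optimises when deciding which node to expand next.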
Implement a counterfactual explainer for logical models and their ensembles.
Scikitlearn prediction explainers.
Decision trees.
Root-to-leaf path (logical conditions).
Counterfactuals.
Rule lists and sets.
Logical conditions list (as text).
Neighbours.
Similarities and differences (on the feature vector) among the neighbours of the same and the opposite class.
K-means.
Prototypes.
Nearest centroid of the same class.
Criticisms.
Nearest centroid of the opposite class.
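The decision-tree prediction explainer above (the root-to-leaf path as a list of logical conditions) can be prototyped with scikit-learn's `decision_path`; `path_conditions` is an illustrative helper name:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

def path_conditions(tree, x):
    """Logical conditions on the root-to-leaf path taken by instance x."""
    t = tree.tree_
    node_ids = tree.decision_path(x.reshape(1, -1)).indices
    conditions = []
    for node in node_ids:
        if t.children_left[node] == -1:    # leaf: no splitting condition
            continue
        op = '<=' if x[t.feature[node]] <= t.threshold[node] else '>'
        conditions.append(
            'feature_%d %s %.3f' % (t.feature[node], op, t.threshold[node]))
    return conditions

conds = path_conditions(tree, X[0])
```

A counterfactual for a logical model can then be framed as the smallest change to x that flips one of these conditions and routes the instance to a leaf of a different class.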
bLIMEy implementation.
Fresh LIME implementation.
Write tutorials similar to the LIME tutorials, in particular this tutorial.
Have a look at what eli5 does: “eli5.lime provides dataset generation utilities for text data (remove random words) and for arbitrary data (sampling using Kernel Density Estimation) … for explaining predictions of probabilistic classifiers eli5 uses another classifier by default, trained using cross-entropy loss, while canonical library fits regression model on probability output.”
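At its core, a LIME-style surrogate – and the explanation step of bLIMEy – samples around the explained instance, weights the samples by proximity and fits a linear model whose coefficients serve as the explanation. A stripped-down sketch that skips the interpretable-representation step; `local_linear_explanation` is an illustrative name, not the package API:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.uniform(size=(300, 3))
y = 4 * X[:, 0] - 2 * X[:, 1]
black_box = RandomForestRegressor(random_state=0).fit(X, y)

def local_linear_explanation(predict, x, rng, n_samples=500, width=0.3):
    """LIME-style surrogate: sample near x, weight by an RBF kernel,
    fit a weighted linear model and read off its coefficients."""
    Z = x + rng.normal(scale=0.1, size=(n_samples, x.size))
    weights = np.exp(-np.sum((Z - x) ** 2, axis=1) / width ** 2)
    surrogate = Ridge(alpha=1.0).fit(Z, predict(Z), sample_weight=weights)
    return surrogate.coef_

coef = local_linear_explanation(black_box.predict, X[0], rng)
```

bLIMEy's contribution is making each of these three steps – representation, sampling and surrogate fitting – independently swappable.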
Milestone 3
The third milestone will integrate the tool with important machine learning and fairness packages.
[Overview table: fairness, accountability and transparency functionality covered for data/features, models and predictions.]
Integration or reimplementation of the fairness360 package (depending on its code quality).
Implement distribution shift metrics.
Implement calibration techniques.
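Calibration techniques are already available in scikit-learn and could anchor this item; a minimal sketch of post-hoc isotonic calibration together with a reliability (calibration) curve:

```python
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, random_state=0)

# Post-hoc calibration: rescale raw scores so they behave like probabilities.
calibrated = CalibratedClassifierCV(GaussianNB(), method='isotonic', cv=5)
calibrated.fit(X, y)
prob = calibrated.predict_proba(X)[:, 1]

# Reliability curve: predicted probability vs observed frequency per bin.
prob_true, prob_pred = calibration_curve(y, prob, n_bins=5)

# Brier score: lower means better-calibrated probabilities.
brier = brier_score_loss(y, prob)
```

The same reliability-curve machinery can double as a simple distribution shift diagnostic when computed on data from a new deployment environment.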
Integration with the SHAP package.
Explainers for models implemented in the XGBoost package.
Explainers for models implemented in the LightGBM package.
Explainers for models implemented in the Lightning package.
Explainers for models implemented in the sklearn-crfsuite package.
eli5 integration. (“Text processing utilities from scikit-learn and can highlight text data accordingly. Pipeline and FeatureUnion are supported. It also allows to debug scikit-learn pipelines which contain HashingVectorizer, by undoing hashing.”)
Implement Bayesian Rule Lists (BRL).
Bayesian Rule Lists (BRL).
Example BRL use case: “Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model”, Letham et al. (2015).
Scalable Bayesian Rule Lists (SBRL).
“Scalable Bayesian Rule Lists”, Yang et al. (2016). (SBRL paper)
Bayesian Rule List Classifier (BRLC) is a Python wrapper for the SBRL.
Big Data Bayesian Rule List Classifier (BigDataBRLC) is a BRLC to handle large datasets.
PD/ICE speed improvements – parallelisation and a progress bar.
iPython/Jupyter Notebook interactive (JS) plots to improve the research applicability of the package.
Milestone 4
This milestone focuses on implementing a collection of tools that enable researchers and practitioners to use the package with (deep) neural networks (deep learning, autograd, optimisation).
[Overview table: fairness, accountability and transparency functionality covered for data/features, models and predictions.]
Integration with the What-If Tool.
Implement Quantitative Input Influence (QII).
Implement epsilon Layer-wise Relevance Propagation (ε-LRP).
“On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation”, Bach S, Binder A, Montavon G, Klauschen F, Müller K-R, Samek W (2015).
“Towards better understanding of gradient-based attribution methods for Deep Neural Networks”, Ancona M, Ceolini E, Öztireli C, Gross M (ICLR, 2018).
Implement occlusion.
“Visualizing and understanding convolutional networks”, Zeiler M and Fergus R (Springer, 2014).
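Occlusion reduces to sliding a masking patch over the input and recording the drop in the model's score at each location. A toy sketch with a synthetic scoring function standing in for a trained network; `occlusion_map` is an illustrative name:

```python
import numpy as np

def model(image):
    """Toy scorer: responds only to intensity in the top-left 4x4 corner."""
    return image[:4, :4].sum()

def occlusion_map(model, image, patch=4):
    """Slide a zeroed patch over the image; the score drop at each
    location is that region's importance for the prediction."""
    base = model(image)
    h, w = image.shape
    heatmap = np.zeros((h - patch + 1, w - patch + 1))
    for i in range(heatmap.shape[0]):
        for j in range(heatmap.shape[1]):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0   # occlude one patch
            heatmap[i, j] = base - model(occluded)
    return heatmap

image = np.ones((8, 8))
heat = occlusion_map(model, image)
```

With a real network the same loop runs over `predict` for the class of interest, and the patch value is usually the dataset mean rather than zero.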
Implement the Integrated Gradients method.
“Axiomatic Attribution for Deep Networks”, Sundararajan M, Taly A, Yan Q (ICML, 2017).
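Integrated Gradients averages the model's gradient along a straight path from a baseline to the input and scales it by the input-baseline difference; by the paper's completeness axiom the attributions sum to f(x) - f(baseline). A toy sketch with an analytic gradient standing in for autograd:

```python
import numpy as np

def f(x):
    """Toy differentiable model: f(x) = x0^2 + 3*x1."""
    return x[0] ** 2 + 3 * x[1]

def grad_f(x):
    """Analytic gradient of the toy model (a framework would autograd this)."""
    return np.array([2 * x[0], 3.0])

def integrated_gradients(grad, x, baseline, steps=100):
    """Average the gradient along the straight path from baseline to x,
    then scale by (x - baseline)."""
    alphas = (np.arange(steps) + 0.5) / steps          # midpoint rule
    path = baseline + alphas[:, None] * (x - baseline)
    avg_grad = np.mean([grad(p) for p in path], axis=0)
    return (x - baseline) * avg_grad

x = np.array([2.0, 1.0])
baseline = np.zeros(2)
attributions = integrated_gradients(grad_f, x, baseline)
```

Checking that the attributions sum to f(x) - f(baseline) is a cheap sanity test any implementation of this item should ship with.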
Implement the DeepLIFT algorithm.
Implement the DeepExplain algorithm.
“Towards better understanding of gradient-based attribution methods for Deep Neural Networks”, Ancona M, Ceolini E, Öztireli C, Gross M (ICLR, 2018).
Finalise full integration of Skater and SHAP (deep neural networks).