fatf.transparency.predictions.surrogate_explainers.TabularBlimeyLime

class fatf.transparency.predictions.surrogate_explainers.TabularBlimeyLime(dataset: numpy.ndarray, predictive_model: object, as_regressor: bool = False, categorical_indices: Optional[List[Union[str, int]]] = None, class_names: Optional[List[str]] = None, feature_names: Optional[List[str]] = None)[source]

A tabular LIME explainer – a surrogate explainer based on a linear model.

Changed in version 0.1.0: (1) Added support for regression models. (2) Changed the feature selection mechanism from k-LASSO to forward_selection when the number of selected features is less than 7, and highest_weights otherwise – the default LIME behaviour.

New in version 0.0.2.

This class implements Local Interpretable Model-agnostic Explanations (LIME) introduced by [RIBEIRO2016WHY]. This implementation mirrors the one in the official LIME package, where it is available as the lime.lime_tabular.LimeTabularExplainer class.

This explainer uses a quartile discretiser (fatf.utils.data.discretisation.QuartileDiscretiser) and a normal sampler (fatf.utils.data.augmentation.NormalSampling) for augmenting the data. The following steps are taken to generate the explanation when the explain_instance method is called (a schematic code sketch of the core steps is given after this list):

  • The input data_row is discretised using the quartile discretiser. The numerical features are binned while the categorical ones (selected via the categorical_indices parameter) are left unchanged.

  • The data are sampled around the discretised data_row using the normal sampler. Since, after the discretisation, all of the features are categorical, the bin indices are sampled based on their frequency in (the discretised version of) the dataset used to initialise this class.

  • The sampled data are reverted back to their original domain and predicted with the black-box model (the predictive_model used to initialise this class). The reversion is done by sampling each (numerical) feature value from the corresponding bin using a truncated normal distribution whose minimum (lower threshold), maximum (upper threshold), mean and standard deviation are computed empirically from all the data points in the dataset whose feature values fall into that bin. The categorical features are left unchanged.

  • The discretised sampled data set is binarised by comparing each row with the user-specified data_row (in the explain_instance method). This step is performed by taking the logical XNOR of the two: 1 if the feature value in a row of the discretised data set is the same as in the data_row and 0 if it is different.

  • The Euclidean distance between the binarised sampled data and binarised data_row is computed and passed through an exponential kernel (fatf.utils.kernels.exponential_kernel) to get similarity scores, which will be used as data point weights when reducing the number of features (see below) and training the linear regression.

  • To limit the number of features in the explanation (if enabled by the user) we use either forward selection, when the number of selected features is less than 7, or highest weights otherwise. (This is controlled by the features_number parameter in the explain_instance method; by default – features_number=None – all of the features are used.)

  • A local (weighted) ridge regression (sklearn.linear_model.Ridge) is fitted to the sampled and binarised data with the target being:

    • The numerical predictions of the black-box model when the underlying model is a regression.

    • A vector of probabilities output by the black-box model for the selected class (one-vs-rest) when the underlying model is a probabilistic classifier. By default (explained_class=None in the explain_instance method) a separate surrogate is trained for each class, however the class to be explained can be specified by the user.
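The core of this procedure, i.e., the binarisation, kernel weighting and surrogate fitting, can be sketched with plain numpy and scikit-learn. This is a minimal illustration rather than the class's internal code: all argument names are placeholders, and the kernel is written in LIME's default form, which fatf.utils.kernels.exponential_kernel is assumed to mirror.

    import numpy as np
    from sklearn.linear_model import Ridge


    def fit_local_surrogate(sampled_discretised, discretised_row,
                            blackbox_targets, kernel_width):
        """Fits a weighted ridge surrogate to binarised sampled data.

        ``sampled_discretised`` stands for the discretised data sampled
        around the explained instance, ``discretised_row`` for the
        discretised instance and ``blackbox_targets`` for the black-box
        predictions of the samples reverted to the original domain.
        """
        # Binarise via XNOR with the explained instance: 1 where a sampled
        # bin id (or categorical value) agrees with the instance.
        binarised = (sampled_discretised == discretised_row).astype(int)

        # The explained instance binarises to an all-ones vector, so the
        # Euclidean distance reduces to the root of the disagreement count.
        distances = np.sqrt(((binarised - 1) ** 2).sum(axis=1))

        # An exponential kernel turns distances into similarity weights.
        weights = np.sqrt(np.exp(-(distances ** 2) / kernel_width ** 2))

        # Weighted ridge regression; its coefficients give the importances.
        surrogate = Ridge(alpha=1)
        surrogate.fit(binarised, blackbox_targets, sample_weight=weights)
        return surrogate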

Note

How to interpret the results?

Because the local surrogate model is trained on the binarised sampled data that is passed through the XNOR operation, the parameters extracted from this model (feature importances) should be interpreted as an answer to the following question:

“Had this particular feature value of the explained data point been outside of this range (for numerical features) or had it been a different value (for categorical features), how would that influence the probability of this point belonging to the explained class (probabilistic classification) / the predicted numerical value (regression)?”

This LIME implementation is limited to black-box probabilistic classifiers and regressors (similarly to the official implementation). Therefore, the predictive_model must have a predict_proba method for probabilistic models and a predict method for regressors. When the surrogate is built for a probabilistic classifier, the local model is trained using the one-vs-rest approach since the output of the global model is an array with probabilities of each class (the classes to be explained can be selected using the explained_class parameter in the explain_instance method). The column indices indicated as categorical features (via the categorical_indices parameter) will not be discretised.

For detailed instructions on how to build a custom surrogate explainer (to avoid tinkering with this class) please see the How to build LIME yourself (bLIMEy) – Surrogate Tabular Explainers how-to guide.

For a description of additional parameters, warnings and errors please see the documentation of the parent class fatf.transparency.predictions.surrogate_explainers.SurrogateTabularExplainer.
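A minimal usage sketch, assuming scikit-learn is installed and using its iris data purely for illustration (the chosen model and parameter values are arbitrary):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    from fatf.transparency.predictions.surrogate_explainers import (
        TabularBlimeyLime)

    iris = load_iris()
    # Any black-box model exposing a predict_proba method will do.
    model = RandomForestClassifier(random_state=42)
    model.fit(iris.data, iris.target)

    explainer = TabularBlimeyLime(
        iris.data,
        model,
        class_names=list(iris.target_names),
        feature_names=list(iris.feature_names))

    # One explanation (a feature name to importance mapping) per class.
    explanation = explainer.explain_instance(
        iris.data[0, :], samples_number=500)
    for class_name, feature_importance in explanation.items():
        print(class_name, feature_importance)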

RIBEIRO2016WHY

Ribeiro, M.T., Singh, S. and Guestrin, C., 2016, August. Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135-1144). ACM.

Parameters
dataset : numpy.ndarray

A 2-dimensional numpy array with a dataset (utilised in various ways throughout the explainer).

predictive_model : object

A pre-trained (black-box) predictive model to be explained. If as_regressor (see below) is set to False, it must have a predict_proba method that takes a data set as the only required input parameter and returns a 2-dimensional numpy array with probabilities of belonging to each class. Otherwise, if as_regressor is set to True, the predictive_model must have a predict method that outputs a 1-dimensional numpy array with numerical predictions.

as_regressor : boolean, optional (default=False)

New in version 0.1.0.

A boolean indicating whether the global model should be treated as regression (True) or probabilistic classification (False).

categorical_indices : List[column indices], optional (default=None)

A list of column indices in the input dataset that should be treated as categorical features.

class_names : List[string], optional (default=None)

A list of strings defining the names of classes. If the predictive model is probabilistic, the order of the class names should correspond to the order of columns output by the model. For other models the order should correspond to lexicographical ordering of all the possible outputs of this model. For example, if the model outputs ['a', 'c', '0'] the class names should be given for ['0', 'a', 'c'] ordering.

feature_names : List[string], optional (default=None)

A list of strings defining the names of the dataset features. The order of the names should correspond to the order of features in the dataset.

Attributes
discretiser : fatf.utils.data.discretisation.Discretiser

An instance of the quartile discretiser (fatf.utils.data.discretisation.QuartileDiscretiser) initialised with the input dataset and used to discretise the data_row when the explain_instance method is called.

augmenter : fatf.utils.data.augmentation.Augmentation

An instance of the normal sampling augmenter (fatf.utils.data.augmentation.NormalSampling) used to sample new data points around the discretised data_row (in the explain_instance method).

bin_sampling_values : Dictionary[dataset column index, Dictionary[discretised bin id, Tuple(float, float, float, float)]]

A dictionary holding characteristics for each bin of each numerical feature. The characteristics are represented as a 4-tuple consisting of: the lower bin boundary, the upper bin boundary, the empirical mean of all the values of this feature for data points (in dataset) falling into that bin, and the empirical standard deviation (calculated in the same way). For the edge bins, if there are data available, the lower edge is calculated empirically (as the minimum of the corresponding feature values falling into that bin), otherwise it is set to -numpy.inf. The same applies to the upper edge, which is either set to numpy.inf or calculated empirically (as the maximum of the corresponding feature values falling into that bin). If there are no data points to calculate the mean and standard deviation for a given bin, these two values are set to numpy.nan. (This does not influence the subsequent reverse sampling, for which this attribute is used: since there were no data for a given bin, the frequency of data for that bin is 0, therefore no data falling into this bin will be sampled.)
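Schematically, the reverse sampling for a single bin can be reproduced with scipy's truncated normal distribution. This is a hypothetical helper for illustration only (it is not part of the fatf API); its four arguments correspond to one tuple held by this attribute and a non-degenerate standard deviation is assumed.

    import numpy as np
    import scipy.stats


    def sample_from_bin(bin_min, bin_max, bin_mean, bin_std, samples_number):
        """Draws numerical feature values for one bin."""
        if np.isnan(bin_mean):
            # An empty bin is never sampled since its frequency is 0.
            return np.array([])
        # scipy expects the truncation thresholds in standardised units;
        # bin_min and bin_max may be -numpy.inf and numpy.inf for edge bins.
        a = (bin_min - bin_mean) / bin_std
        b = (bin_max - bin_mean) / bin_std
        return scipy.stats.truncnorm.rvs(
            a, b, loc=bin_mean, scale=bin_std, size=samples_number)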

Raises
ImportError

The scikit-learn package is missing.

Methods

explain_instance(data_row: Union[numpy.ndarray, numpy.void], …)

Explains the data_row with linear regression feature importance.

explain_instance(data_row: Union[numpy.ndarray, numpy.void], explained_class: Union[int, str, None] = None, samples_number: int = 50, features_number: Optional[int] = None, kernel_width: Optional[float] = None, return_models: bool = False) → Union[Dict[str, Dict[str, float]], Tuple[Dict[str, Dict[str, float]], Union[Dict[str, fatf.utils.models.models.Model], fatf.utils.models.models.Model]]][source]

Explains the data_row with linear regression feature importance.

Changed in version 0.1.0: Changed the feature selection mechanism from k-LASSO to forward_selection when the number of selected features is less than 7, and highest_weights otherwise – the default LIME behaviour.

For probabilistic classifiers the explanations will be produced for all of the classes by default. This can be changed by selecting a specific class with the explained_class parameter.

The default kernel_width is computed as the square root of the number of features, multiplied by 0.75. Also, by default, all of the (interpretable) features will be used to create an explanation; this can be limited by setting the features_number parameter. The data sampling around the data_row can be customised by specifying the number of points to be generated (samples_number).
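For example, for a 4-dimensional data set the default width works out as follows:

    import numpy as np

    features_number = 4
    kernel_width = np.sqrt(features_number) * 0.75  # 2 * 0.75 = 1.5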

By default, this method only returns feature importance, however by setting return_models to True, it will also return the local linear surrogates for further analysis and processing done outside of this method.

Note

The exact description of the explanation generation procedure can be found in the documentation of this class (fatf.transparency.predictions.surrogate_explainers.TabularBlimeyLime).

For additional parameters, warnings and errors please see the parent class method fatf.transparency.predictions.surrogate_explainers.SurrogateTabularExplainer.explain_instance.

Parameters
data_row : Union[numpy.ndarray, numpy.void]

A data point to be explained (1-dimensional numpy array).

explained_class : Union[integer, string], optional (default=None)

The class to be explained – only applicable to probabilistic classifiers. If None, all of the classes will be explained. This can either be the index of the class (the column index of the probabilistic vector) or the class name (taken from self.class_names).

samples_number : integer, optional (default=50)

The number of data points sampled from the normal augmenter, which will be used to fit the local surrogate model.

features_number : integer, optional (default=None)

The maximum number of (interpretable) features – found with forward selection or highest weights – to be used in the explanation (the local surrogate model is trained with this feature subset). By default (None), all of the (interpretable) features are used.

kernel_width : float, optional (default=None)

The width of the exponential kernel used when computing weights of the sampled data based on the distances between the sampled data and the data_row. The default kernel_width (kernel_width=None) is computed as the square root of the number of features, multiplied by 0.75.

return_models : boolean, optional (default=False)

If True, this method will return both the feature importance explanation dictionary and a dictionary holding the local models. Otherwise, only the first dictionary will be returned.

Returns
explanations : Dictionary[string, Dictionary[string, float]]

A dictionary holding dictionaries that contain feature importance – where the feature names are taken from self.feature_names and the feature importances are extracted from local linear surrogates. These dictionaries are held under keys corresponding to class names (taken from self.class_names).

models : sklearn.linear_model.base.LinearModel, optional

A dictionary holding locally fitted surrogate linear models held under class name keys (taken from self.class_names). This dictionary is only returned when the return_models parameter is set to True.
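Continuing the iris sketch given earlier in this document, the two return values could be consumed as follows (the importance values shown in the comment are made up):

    explanation, models = explainer.explain_instance(
        iris.data[0, :], samples_number=500, return_models=True)

    # explanation is keyed by class name, for example:
    # {'setosa': {'sepal length (cm)': 0.02, ...}, 'versicolor': {...}, ...}
    setosa_importance = explanation['setosa']

    # models maps the same class names to the fitted local surrogates.
    setosa_surrogate = models['setosa']
    print(setosa_surrogate.coef_)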

Raises
TypeError

The explained_class parameter is neither None, an integer nor a string. The samples_number parameter is not an integer. The features_number parameter is neither None nor an integer. The kernel_width parameter is neither None nor a number. The return_models parameter is not a boolean.

ValueError

The samples_number parameter is a non-positive integer (smaller than 1). The features_number parameter is a non-positive integer (smaller than 1). The kernel_width parameter is a non-positive number (smaller than or equal to 0). The explained_class specified by the user could be recognised neither as one of the allowed class names (self.class_names) nor as an index of a class name.
