# How to build LIME yourself (bLIMEy) – Surrogate Tabular Explainers

This how-to guide illustrates how to construct a local surrogate model on top of a black-box model and use it to generate explanations of selected predictions of the black-box model.

This how-to guide requires the `scikit-learn` package as it uses ridge
regression and decision tree predictors (implemented therein) as local
surrogate models.

Each surrogate explainer is composed of three main parts:

- interpretable data representation;
- data sampling; and
- explanation generation.

Choosing a particular algorithm for each of these components shapes the type
of surrogate explanations that can be generated with the final explainer.
(The theoretical considerations for each component can be found in the
Surrogate Transparency User Guide, [SOKOL2019BLIMEY] and the
Jupyter Notebook distributed with the latter manuscript.)
In this how-to guide we will show how to build the tabular LIME explainer
[RIBEIRO2016WHY] (with the fixed sampling procedure [SOKOL2019BLIMEY] and
the sampling algorithm replaced with Mixup,
`fatf.utils.data.augmentation.Mixup`) and a simple tree-based surrogate.

Two similar surrogate explainers are already distributed with this package:
`fatf.transparency.predictions.surrogate_explainers.TabularBlimeyLime` and
`fatf.transparency.predictions.surrogate_explainers.TabularBlimeyTree`.
However, the LIME explainer implementation is an exact replica of its official
implementation, hence it does the “reverse sampling”, which introduces
randomness to the explainer. Both of these classes trade flexibility for
convenience: there is no need to build the explainers from scratch, but none
of the three aforementioned components can be customised.

Note

Deploying Surrogate Explainers

You may want to consider using the abstract
`fatf.transparency.predictions.surrogate_explainers.SurrogateTabularExplainer`
class to implement a custom surrogate explainer for tabular data. This
abstract class implements a series of input validation steps and internal
attribute computation that make implementing a custom surrogate considerably
easier.

- [SOKOL2019BLIMEY] Sokol, K., Hepburn, A., Santos-Rodriguez, R. and Flach, P., 2019. bLIMEy: Surrogate Prediction Explanations Beyond LIME. 2019 Workshop on Human-Centric Machine Learning (HCML 2019). 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada. arXiv preprint arXiv:1910.13016. URL https://arxiv.org/abs/1910.13016.

- [RIBEIRO2016WHY] Ribeiro, M.T., Singh, S. and Guestrin, C., 2016, August. Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135-1144). ACM.

## Setup

First, let us set the random seed to ensure reproducibility of the results:

```
>>> import fatf
>>> fatf.setup_random_seed(42)
```

We will also need `numpy`:

```
>>> import numpy as np
```

Next, we need to load the IRIS data set, which we will use for this how-to guide:

```
>>> import fatf.utils.data.datasets as fatf_datasets
>>> iris_data_dict = fatf_datasets.load_iris()
>>> iris_data = iris_data_dict['data']
>>> iris_target = iris_data_dict['target']
>>> iris_feature_names = iris_data_dict['feature_names'].tolist()
>>> iris_target_names = iris_data_dict['target_names'].tolist()
```

Now, we will train a black-box model, a k-nearest neighbours predictor:

```
>>> import fatf.utils.models.models as fatf_models
>>> blackbox_model = fatf_models.KNN(k=3)
>>> blackbox_model.fit(iris_data, iris_target)
```

and compute its training set accuracy:

```
>>> import sklearn.metrics
>>> predictions = blackbox_model.predict(iris_data)
>>> sklearn.metrics.accuracy_score(iris_target, predictions)
0.96
```

As you can see, the IRIS dataset is reasonably easy for the k-NN classifier and it achieves a high accuracy. Next, we need to choose a data point for which we will generate an explanation with respect to this model:

```
>>> data_point = iris_data[0]
>>> data_point
array([5.1, 3.5, 1.4, 0.2], dtype=float32)
>>> data_point_probabilities = blackbox_model.predict_proba(
... data_point.reshape(1, -1))[0]
>>> data_point_probabilities
array([1., 0., 0.])
>>> data_point_prediction = data_point_probabilities.argmax(axis=0)
>>> data_point_prediction
0
>>> data_point_class = iris_target_names[data_point_prediction]
>>> data_point_class
'setosa'
```

Let us visualise where the `data_point` lies in the data set by plotting the
last two dimensions of the data and highlighting the `data_point`:

```
>>> import matplotlib.pyplot as plt
>>> iris_feature_names[2:]
['petal length (cm)', 'petal width (cm)']
>>> _ = plt.figure()
>>> _ = plt.scatter(
... iris_data[1:50, 2],
... iris_data[1:50, 3],
... label=iris_target_names[0])
>>> _ = plt.scatter(
... iris_data[50:100, 2],
... iris_data[50:100, 3],
... label=iris_target_names[1])
>>> _ = plt.scatter(
... iris_data[100:150, 2],
... iris_data[100:150, 3],
... label=iris_target_names[2])
>>> _ = plt.scatter(
... data_point[2],
... data_point[3],
... label='Explained Data Point',
... s=200, c='k')
>>> _ = plt.xlabel(iris_feature_names[2])
>>> _ = plt.ylabel(iris_feature_names[3])
>>> _ = plt.legend()
```

## Surrogate Linear Model (LIME)

We will use the quartile discretisation for the
*interpretable data representation*:

```
>>> import fatf.utils.data.discretisation as fatf_discretisation
>>> discretiser = fatf_discretisation.QuartileDiscretiser(
... iris_data,
... feature_names=iris_feature_names)
```

Mixup for *data sampling*:

```
>>> import fatf.utils.data.augmentation as fatf_augmentation
>>> augmenter = fatf_augmentation.Mixup(iris_data, ground_truth=iris_target)
```

and a ridge regression for *explanation generation*:

```
>>> import sklearn.linear_model
>>> lime = sklearn.linear_model.Ridge()
```

### Data Augmentation

First, we will sample new data in the neighbourhood of the selected
`data_point`, predict them with the black-box model and plot them:

```
>>> sampled_data = augmenter.sample(data_point, samples_number=50)
>>> sampled_data_probabilities = blackbox_model.predict_proba(sampled_data)
>>> sampled_data_predictions = sampled_data_probabilities.argmax(axis=1)
>>> sampled_data_0_indices = np.where(sampled_data_predictions == 0)[0]
>>> sampled_data_1_indices = np.where(sampled_data_predictions == 1)[0]
>>> sampled_data_2_indices = np.where(sampled_data_predictions == 2)[0]
>>> _ = plt.figure()
>>> _ = plt.scatter(
... iris_data[1:50, 2],
... iris_data[1:50, 3],
... label=iris_target_names[0])
>>> _ = plt.scatter(
... iris_data[50:100, 2],
... iris_data[50:100, 3],
... label=iris_target_names[1])
>>> _ = plt.scatter(
... iris_data[100:150, 2],
... iris_data[100:150, 3],
... label=iris_target_names[2])
>>> _ = plt.scatter(
... data_point[2],
... data_point[3],
... label='Explained Data Point: {}'.format(data_point_class),
... s=200,
... c='k')
>>> _ = plt.scatter(
... sampled_data[sampled_data_0_indices, 2],
... sampled_data[sampled_data_0_indices, 3],
... label='Augmented Data: {}'.format(iris_target_names[0]))
>>> _ = plt.scatter(
... sampled_data[sampled_data_1_indices, 2],
... sampled_data[sampled_data_1_indices, 3],
... label='Augmented Data: {}'.format(iris_target_names[1]))
>>> _ = plt.scatter(
... sampled_data[sampled_data_2_indices, 2],
... sampled_data[sampled_data_2_indices, 3],
... label='Augmented Data: {}'.format(iris_target_names[2]))
>>> _ = plt.xlabel(iris_feature_names[2])
>>> _ = plt.ylabel(iris_feature_names[3])
>>> _ = plt.legend()
```
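For intuition about what a Mixup-style sampler does, the sketch below forms convex combinations of the explained point with randomly chosen rows of a data set. It is not the `fatf.utils.data.augmentation.Mixup` implementation: the function name `mixup_like_sample`, the toy data, and the Beta(2, 5) mixing distribution are illustrative assumptions.

```python
import numpy as np


def mixup_like_sample(data_point, dataset, samples_number, rng,
                      beta_params=(2.0, 5.0)):
    """Sample convex combinations of ``data_point`` with random dataset rows.

    Hypothetical helper for illustration only; the Beta parameters are an
    assumption, not a documented default of any library.
    """
    # Pick a random partner row from the data set for every new sample.
    partner_indices = rng.integers(0, dataset.shape[0], size=samples_number)
    partners = dataset[partner_indices]
    # Mixing coefficients in [0, 1]; small values keep a sample close to
    # ``data_point``, large values move it towards the partner row.
    lam = rng.beta(*beta_params, size=samples_number)[:, np.newaxis]
    return (1.0 - lam) * data_point + lam * partners


rng = np.random.default_rng(42)
toy_data = np.array([[5.1, 3.5, 1.4, 0.2],
                     [7.0, 3.2, 4.7, 1.4],
                     [6.3, 3.3, 6.0, 2.5]])
toy_samples = mixup_like_sample(toy_data[0], toy_data, 50, rng)
```

Because every sample is a convex combination of two existing rows, it stays within the per-feature range of the data, which is one reason this kind of sampler tends to produce plausible neighbours.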

In the case of LIME we use the probabilistic output of the black-box
classifier because the local model (ridge regression) is fitted against the
probabilities of a selected class. When using any other model (cf. the
decision tree surrogate section below) it is possible to use class predictions
instead. Using the probabilistic output of the black-box model also entails
training the local model as one-vs-rest for a selected class to be explained.
In this case we will explain the class to which the selected `data_point`
belongs: `'setosa'`.

### Interpretable Representation

LIME introduces an explicit interpretable representation – discretisation of
continuous features – to improve comprehensibility of explanations. This step
may not be necessary for other choices of local surrogates (cf. the decision
tree surrogate section below) but for LIME it allows the explanation to
indicate how moving the data point out of each of the discretised bins would
affect the prediction. The exact steps taken by LIME are described in the
documentation of the
`fatf.transparency.predictions.surrogate_explainers.TabularBlimeyLime` class.

First, we transform the selected `data_point` and the data sampled around it
into the interpretable representation, i.e., we discretise them:

```
>>> data_point_discretised = discretiser.discretise(data_point)
>>> sampled_data_discretised = discretiser.discretise(sampled_data)
```
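As a rough illustration of quartile discretisation, the sketch below computes per-feature quartiles with `numpy` and maps every value to a bin index between 0 and 3. It is not the `QuartileDiscretiser` implementation, and its handling of values that sit exactly on a bin edge is an assumption; the helper name `quartile_discretise` is hypothetical.

```python
import numpy as np


def quartile_discretise(dataset, rows):
    """Map each feature value in ``rows`` to its quartile bin (0 to 3).

    The bin edges are the per-feature quartiles of ``dataset``.
    """
    # Per-feature 25th, 50th and 75th percentiles; shape (3, n_features).
    bin_edges = np.percentile(dataset, [25, 50, 75], axis=0)
    rows = np.atleast_2d(rows)
    binned = np.empty(rows.shape, dtype=np.int8)
    for j in range(rows.shape[1]):
        # side='left' places a value equal to an edge in the lower bin,
        # i.e. bin 0 holds all values <= the first quartile.
        binned[:, j] = np.searchsorted(
            bin_edges[:, j], rows[:, j], side='left')
    return binned


# One feature with values 1..8; the quartiles are 2.75, 4.5 and 6.25.
toy_column = np.arange(1, 9, dtype=float).reshape(-1, 1)
toy_bins = quartile_discretise(toy_column, toy_column)
```

Each bin index can then be given a human-readable name such as `feature <= first quartile`, which is what makes this representation interpretable.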

Next, we create a new representation of the discretised data, which indicates,
for each discretised feature of the sampled data, whether it falls into the
same bin as the corresponding feature of the `data_point`:

```
>>> import fatf.utils.data.transformation as fatf_transformation
>>> sampled_data_binarised = fatf_transformation.dataset_row_masking(
... sampled_data_discretised, data_point_discretised)
```

Let us show how this affects the first sampled data point:

```
>>> data_point_discretised
array([0, 3, 0, 0], dtype=int8)
>>> sampled_data_discretised[0, :]
array([1, 3, 0, 0], dtype=int8)
>>> sampled_data_binarised[0, :]
array([0, 1, 1, 1], dtype=int8)
```
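The binarisation step amounts to an element-wise comparison of bin indices. The sketch below reproduces it with plain `numpy` rather than `dataset_row_masking`; the first sampled row is the one shown above, while the second and third rows are made up for illustration.

```python
import numpy as np

# Bin indices of the explained data point (taken from the output above).
data_point_bins = np.array([0, 3, 0, 0], dtype=np.int8)

# First row as shown above; the other two rows are hypothetical.
sampled_bins = np.array([[1, 3, 0, 0],
                         [0, 3, 0, 0],
                         [2, 1, 3, 0]], dtype=np.int8)

# 1 where a sample shares the data point's bin for a feature, 0 otherwise.
binarised = (sampled_bins == data_point_bins).astype(np.int8)
```

A row of all ones therefore means that the sample falls into the same hyper-rectangle of bins as the explained data point.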

### Explanation Generation

Finally, we fit a local linear (ridge) regression to the locally sampled,
discretised and binarised data and extract the explanation from its
coefficients. To enforce the locality of the explanation even further, we first
calculate the distances between the binarised `data_point` and the sampled
data and kernelise these distances (with an exponential kernel) to get data
point weights. We use \(0.75 \times \sqrt{\text{number of features}}\) as the
kernel width:

```
>>> import fatf.utils.distances as fatf_distances
>>> import fatf.utils.kernels as fatf_kernels
>>> features_number = sampled_data_binarised.shape[1]
>>> kernel_width = np.sqrt(features_number) * 0.75
>>> distances = fatf_distances.euclidean_point_distance(
... np.ones(features_number), sampled_data_binarised)
>>> weights = fatf_kernels.exponential_kernel(
... distances, width=kernel_width)
```

We use `np.ones(...)` here as it is equivalent to binarising the
`data_point` against itself:

```
>>> fatf_transformation.dataset_row_masking(
... data_point_discretised.reshape(1, -1), data_point_discretised)
array([[1, 1, 1, 1]], dtype=int8)
```
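To make the distance-to-weight step concrete, the sketch below kernelises Euclidean distances in the binary space. It assumes the LIME-style kernel form \(\sqrt{\exp(-d^2 / w^2)}\); the function name `lime_style_exponential_kernel` and the toy samples are illustrative and not part of the package.

```python
import numpy as np


def lime_style_exponential_kernel(distances, width):
    """Map distances to weights in (0, 1]; zero distance gives weight 1.

    Assumed kernel form: sqrt(exp(-d^2 / width^2)).
    """
    return np.sqrt(np.exp(-(distances ** 2) / width ** 2))


# Toy binarised samples; the reference point is a vector of ones.
binarised_samples = np.array([[1, 1, 1, 1],
                              [0, 1, 1, 1],
                              [0, 0, 0, 0]])
reference = np.ones(binarised_samples.shape[1])

# Euclidean distance of every sample to the reference point.
toy_distances = np.linalg.norm(binarised_samples - reference, axis=1)
# Kernel width of 0.75 * sqrt(number of features), i.e. 1.5 for 4 features.
toy_width = 0.75 * np.sqrt(binarised_samples.shape[1])
toy_weights = lime_style_exponential_kernel(toy_distances, toy_width)
```

Samples that share all bins with the explained point receive the maximum weight of 1, and the weight decays smoothly as more binary features differ.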

As mentioned before, we will explain the `'setosa'` class, which has index
`0`:

```
>>> iris_target_names.index('setosa')
0
```

Therefore, we extract the probabilities of the first column (with index `0`)
from the black-box predictions:

```
>>> sampled_data_predictions_setosa = sampled_data_probabilities[:, 0]
```

Next, we do weighted feature selection to introduce sparsity to the explanation. To this end, we use k-LASSO and select 2 features with it:

```
>>> import fatf.utils.data.feature_selection.sklearn as fatf_feature_ssk
>>> lasso_indices = fatf_feature_ssk.lasso_path(
... sampled_data_binarised, sampled_data_predictions_setosa, weights, 2)
```

Now, we prepare the binarised data set for training the surrogate ridge regression by extracting the features chosen with lasso:

```
>>> sampled_data_binarised_2f = sampled_data_binarised[:, lasso_indices]
```

and retrieve the names of these two binary features (in the interpretable representation):

```
>>> interpretable_feature_names = []
>>> for feature_index in lasso_indices:
...     bin_id = data_point_discretised[feature_index].astype(int)
...     interpretable_feature_name = (
...         discretiser.feature_value_names[feature_index][bin_id])
...     interpretable_feature_names.append(interpretable_feature_name)
>>> interpretable_feature_names
['*petal length (cm)* <= 1.60', '*petal width (cm)* <= 0.30']
```

Last but not least, we train a local weighted ridge regression:

```
>>> lime.fit(
... sampled_data_binarised_2f,
... sampled_data_predictions_setosa,
... sample_weight=weights)
Ridge()
```

and explain the `data_point` with its coefficients:

```
>>> for name, importance in zip(interpretable_feature_names, lime.coef_):
...     print('->{}<-: {}'.format(name, importance))
->*petal length (cm)* <= 1.60<-: 0.4297609038698995
->*petal width (cm)* <= 0.30<-: 0.37901863586706086
```

This explanation agrees with our intuition based on the data scatter plot: if the petal length (x-axis) is larger than 1.6, we move outside of the blue cluster (setosa), and if the petal width (y-axis) is larger than 0.3, we also move outside of the blue cluster.
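For readers curious about what a weighted ridge fit computes, the sketch below solves the weighted least-squares problem with an L2 penalty on the coefficients (and an unpenalised intercept) in closed form. It is a minimal `numpy` illustration under those assumptions, not a description of the internals of `sklearn.linear_model.Ridge`; the function name `weighted_ridge` and the toy data are hypothetical.

```python
import numpy as np


def weighted_ridge(X, y, sample_weight, alpha=1.0):
    """Minimise sum_i w_i * (y_i - x_i @ coef - b)^2 + alpha * ||coef||^2."""
    w = np.asarray(sample_weight, dtype=float)
    # Centre with weighted means so that the intercept is not penalised.
    X_mean = np.average(X, axis=0, weights=w)
    y_mean = np.average(y, weights=w)
    Xc = X - X_mean
    yc = y - y_mean
    # Weighted normal equations with the ridge term on the diagonal.
    gram = Xc.T @ (Xc * w[:, np.newaxis]) + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ (w * yc))
    intercept = y_mean - X_mean @ coef
    return coef, intercept


# Noiseless toy target: 0.1 + 0.5 * x0 + 0.3 * x1 on binary features.
toy_X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
toy_y = 0.1 + 0.5 * toy_X[:, 0] + 0.3 * toy_X[:, 1]
toy_w = np.array([1.0, 0.5, 0.5, 0.25])

coef_unpenalised, intercept_unpenalised = weighted_ridge(
    toy_X, toy_y, toy_w, alpha=0.0)
coef_ridge, intercept_ridge = weighted_ridge(toy_X, toy_y, toy_w, alpha=1.0)
```

With `alpha=0.0` the fit recovers the generating coefficients of the toy target, whereas a positive `alpha` shrinks the coefficient vector towards zero. This is why the magnitudes of LIME coefficients depend on the chosen regularisation strength as well as on the data.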

We leave explaining the other classes as an exercise for the reader.

## Surrogate Tree

A linear regression fitted as one-vs-rest to probabilities of a selected class is not the only surrogate that can give us some insights into the black-box model operations. Next, we train a shallow local decision tree.

Since a decision tree can learn its own interpretable representation – the
feature splits – we can use the sampled data in its original domain to train
the surrogate tree. Furthermore, by limiting the depth of the tree we force it
to do feature selection, hence no need for an auxiliary dimensionality
reduction. To this end, we just need to compute weights between the sampled
data and the `data_point` in this domain:

```
>>> features_number = sampled_data.shape[1]
>>> kernel_width = np.sqrt(features_number) * 0.75
>>> distances = fatf_distances.euclidean_point_distance(
... data_point, sampled_data)
>>> weights = fatf_kernels.exponential_kernel(
... distances, width=kernel_width)
```

Lastly, we need to decide whether we want to train the tree as a regressor for probabilities of one of the classes (as with LIME) or use a classification tree. We will go with the latter option. Now, we have a choice between training the tree as a multi-class classifier for all of the classes or as one-vs-rest for a selected class. The advantage of the former is that the same tree can be used to explain all of the classes at once; therefore, we will go with a multi-class classification tree:

```
>>> import sklearn.tree
>>> blimey_tree = sklearn.tree.DecisionTreeClassifier(max_depth=3)
>>> blimey_tree.fit(
... sampled_data, sampled_data_predictions, sample_weight=weights)
DecisionTreeClassifier(max_depth=3)
```
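Had we chosen the one-vs-rest alternative instead, the only change would be in the training labels. A minimal sketch, using made-up predictions rather than the actual `sampled_data_predictions`:

```python
import numpy as np

# Hypothetical black-box class predictions for five sampled points.
predicted_classes = np.array([0, 2, 1, 0, 2])
explained_class = 0  # index of the class to be explained

# 1 for the explained class, 0 for every other class.
one_vs_rest_labels = (predicted_classes == explained_class).astype(int)
```

A tree trained on such labels only describes the boundary of the selected class, so a separate tree would be needed for each class to be explained.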

One possible explanation that we can extract from the tree is feature importance:

```
>>> for n_i in zip(iris_feature_names, blimey_tree.feature_importances_):
...     name, importance = n_i
...     print('->{}<-: {}'.format(name, importance))
->sepal length (cm)<-: 0.0
->sepal width (cm)<-: 0.0057061683826981156
->petal length (cm)<-: 0.008758435540648965
->petal width (cm)<-: 0.9855353960766529
```

This explanation agrees with LIME in that petal width is highly influential, but it is less informative than the one derived with LIME. A better explanation is the tree structure itself:

```
>>> blimey_tree_text = sklearn.tree.export_text(
... blimey_tree, feature_names=iris_feature_names)
>>> print(blimey_tree_text)
|--- petal width (cm) <= 0.71
|   |--- class: 0
|--- petal width (cm) >  0.71
|   |--- petal length (cm) <= 4.58
|   |   |--- class: 1
|   |--- petal length (cm) >  4.58
|   |   |--- sepal width (cm) <= 2.91
|   |   |   |--- class: 2
|   |   |--- sepal width (cm) >  2.91
|   |   |   |--- class: 1
```
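One way to read the printed tree is to transcribe its rules by hand and trace the explained `data_point` through them. The function below is such a manual transcription of the output above (the name `surrogate_tree_predict` is ours, and the thresholds are the rounded values shown in the printout), not something generated by `scikit-learn`.

```python
def surrogate_tree_predict(sepal_length, sepal_width,
                           petal_length, petal_width):
    """Hand-transcribed rules of the printed surrogate tree."""
    # sepal_length is accepted for completeness; the tree never splits on it.
    if petal_width <= 0.71:
        return 0  # setosa
    if petal_length <= 4.58:
        return 1  # versicolor
    if sepal_width <= 2.91:
        return 2  # virginica
    return 1  # versicolor


# The explained data point: [5.1, 3.5, 1.4, 0.2].
explained_class = surrogate_tree_predict(5.1, 3.5, 1.4, 0.2)
```

The explained point is resolved by the very first split (petal width of 0.2 is below 0.71), so the surrogate assigns it class `0`, matching the black-box prediction of `'setosa'`.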

Let us recall the sampled data:

```
>>> _ = plt.figure()
>>> _ = plt.scatter(
... sampled_data[sampled_data_0_indices, 2],
... sampled_data[sampled_data_0_indices, 3],
... label='Augmented Data: {}'.format(iris_target_names[0]))
>>> _ = plt.scatter(
... sampled_data[sampled_data_1_indices, 2],
... sampled_data[sampled_data_1_indices, 3],
... label='Augmented Data: {}'.format(iris_target_names[1]))
>>> _ = plt.scatter(
... sampled_data[sampled_data_2_indices, 2],
... sampled_data[sampled_data_2_indices, 3],
... label='Augmented Data: {}'.format(iris_target_names[2]))
>>> _ = plt.scatter(
... data_point[2],
... data_point[3],
... label='Explained Data Point: {}'.format(data_point_class),
... s=200,
... c='k')
>>> _ = plt.xlabel(iris_feature_names[2])
>>> _ = plt.ylabel(iris_feature_names[3])
>>> _ = plt.legend()
```

Clearly, the first split, *petal width (cm) <= 0.71* (on the y-axis), is
enough to separate the blue cloud (setosa) from the other two classes, and
the *petal length (cm) <= 4.58* split for petal width > 0.71 is the best we
can do to separate the orange and green clouds. Had we sampled more data,
the local surrogate would likely have better approximated the local decision
boundary of the black-box model. We leave further experiments in this
direction to the reader.