fatf.utils.data.augmentation.LocalSphere

class fatf.utils.data.augmentation.LocalSphere(dataset: numpy.ndarray, categorical_indices: Optional[List[Union[str, int]]] = None, int_to_float: bool = True)

Sampling data in a hyper-sphere around the selected data point.

New in version 0.0.2.

LocalSphere implements an adapted version of the local fidelity sampling method introduced by [LAUGEL2018DEFINING]. For a given data point, it samples uniformly within a hyper-sphere whose radius is a specified percentage of the maximum L2 (Euclidean) distance between that data point and all the other instances in the input dataset.
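The sampling idea described above can be sketched with plain NumPy. This is an illustration only, not the fatf implementation, whose adapted version may differ in detail; the function name sample_in_hypersphere and its seed argument are hypothetical.

```python
import numpy as np


def sample_in_hypersphere(dataset, data_row, radius_percentage=5,
                          samples_number=50, seed=None):
    """Draw points uniformly from a ball centred on ``data_row``.

    Hypothetical helper written for illustration; it is not part of fatf.
    """
    rng = np.random.default_rng(seed)
    dataset = np.asarray(dataset, dtype=float)
    data_row = np.asarray(data_row, dtype=float)

    # Radius: a percentage of the largest L2 distance from data_row
    # to any row of the dataset.
    max_distance = np.linalg.norm(dataset - data_row, axis=1).max()
    radius = max_distance * radius_percentage / 100.0

    n_features = data_row.shape[0]
    # Normalised Gaussian vectors are uniformly distributed directions.
    directions = rng.normal(size=(samples_number, n_features))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Scaling by u ** (1 / d) spreads points uniformly over the volume
    # instead of clustering them near the centre.
    lengths = radius * rng.uniform(size=(samples_number, 1)) ** (1.0 / n_features)
    return data_row + directions * lengths


# Small numerical data set; the first row is the point of interest.
data = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
samples = sample_in_hypersphere(data, data[0], radius_percentage=50, seed=0)
```

In this toy data set the largest distance from the first row is 5, so a radius percentage of 50 keeps every sample within 2.5 of that row.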

Note

Categorical features.

This augmenter does not currently support data sets with categorical features.

For additional parameters, attributes, warnings and exceptions raised by this class please see the documentation of its parent class: fatf.utils.data.augmentation.Augmentation.

LAUGEL2018DEFINING

Laugel, T., Renard, X., Lesot, M. J., Marsala, C., & Detyniecki, M. (2018). Defining locality for surrogates in post-hoc interpretability. Workshop on Human Interpretability for Machine Learning (WHI) – International Conference on Machine Learning, 2018.

Raises
NotImplementedError

Some of the features in the data set are categorical – this feature type is not supported at present.

Methods

sample(data_row, …)

Samples new data in a hyper-sphere around the selected data point.

sample(data_row: Union[numpy.ndarray, numpy.void], fidelity_radius_percentage: int = 5, samples_number: int = 50) → numpy.ndarray

Samples new data in a hyper-sphere around the selected data point.

For a description of the remaining parameters, warnings and errors, please see the documentation of the fatf.utils.data.augmentation.Augmentation.sample method in the parent class.

Parameters
fidelity_radius_percentage : integer, optional (default=5)

The percentage of the maximum distance between the input data_row and all of the points in the dataset (provided when initialising this class), which will determine the radius of the hyper-sphere used for sampling uniformly around the data_row.

Returns
samples : numpy.ndarray

A numpy array of shape [samples_number, number of features] holding the sampled data.

Raises
NotImplementedError

The data_row is None – sampling from the mean of the dataset used to initialise this class is not yet implemented.

TypeError

The fidelity_radius_percentage parameter is not an integer.

ValueError

The fidelity_radius_percentage parameter is not a positive integer.
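The TypeError and ValueError conditions listed above amount to a two-step check on fidelity_radius_percentage. A minimal sketch of such a check follows, assuming a hypothetical helper named check_fidelity_radius_percentage; the actual validation code and error messages in fatf may differ.

```python
def check_fidelity_radius_percentage(value):
    """Validate a radius percentage as described in the Raises section.

    Hypothetical helper for illustration; it is not part of fatf.
    """
    # Type check first: a float such as 5.0 is rejected even though it is
    # numerically whole. (Assumption: booleans are not treated specially
    # here; Python's bool is a subclass of int.)
    if not isinstance(value, int):
        raise TypeError('fidelity_radius_percentage must be an integer.')
    # Value check second: zero and negative percentages are rejected.
    if value <= 0:
        raise ValueError('fidelity_radius_percentage must be a positive '
                         'integer.')
    return value


# The default value passes the check unchanged.
radius_percentage = check_fidelity_radius_percentage(5)
```

Checking the type before the value keeps the two error conditions distinct, which matches the separate TypeError and ValueError entries in the documentation.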