fatf.transparency.sklearn.tools.SKLearnExplainer
class fatf.transparency.sklearn.tools.SKLearnExplainer(clf: sklearn.base.BaseEstimator, feature_names: Optional[List[str]] = None, class_names: Optional[List[str]] = None)
Implements a base scikit-learn model explainer class.
New in version 0.0.2.
Every scikit-learn model explainer class should inherit from this class. It should also override the following four private methods:
- _validate_kind_fitted
- _is_classifier
- _get_features_number
- _get_classes_array
For their expected functionality please see their respective documentation.
The explainer should also implement at least one of the explanatory methods that are inherited from SKLearnExplainer’s parent class (fatf.utils.transparency.explainers.Explainer): feature_importance, explain_model, and/or explain_instance.
Alternatively, a new method that explains an aspect of the model or its predictions can be introduced.
This class logs a message if the feature names were not given and are inferred from the number of features using the “feature %d” pattern. A message is also logged if the class names were not given and are inferred from the class ids (held in the classes_array attribute) using the “class %s” pattern.
- Parameters
- clf : sklearn.base.BaseEstimator
A scikit-learn model.
- feature_names : Optional[List[string]]
A list of strings representing feature names in the order they appear in the numpy array used to train the clf predictive model.
- class_names : Optional[List[string]]
A list of strings representing class names. The order of this list has to correspond to the lexicographical ordering of the unique values in the target (ground truth) array used to train the clf predictor. For example, if your target array has the following values ['aa', 'a', '0', 'b'], your class names should be given for the following ordering of the class ids: ['0', 'a', 'aa', 'b'].
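The ordering requirement for class_names can be checked by hand before constructing an explainer. The following is a minimal sketch in plain Python, assuming string class ids; the human-readable names in class_names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Target values as they might appear in a training array (values from the example above).
target_values: List[str] = ['aa', 'a', '0', 'b', 'a', '0']

# Unique class ids in lexicographical order.
class_ids: List[str] = sorted(set(target_values))

# Hypothetical class names, listed in the same order as class_ids.
class_names: List[str] = ['zero', 'letter a', 'double a', 'letter b']

# Pair every class id with its name to verify the intended correspondence.
id_to_name: Dict[str, str] = dict(zip(class_ids, class_names))

print(class_ids)
print(id_to_name)
```

Printing the pairing before passing class_names to the explainer makes a misordered list easy to spot.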
- Attributes
- clf : sklearn.base.BaseEstimator
A fitted scikit-learn model.
- feature_names : Union[None, List[string]]
Either None or a list of feature names in the order they appear in the numpy array used to train the clf model.
- class_names : Union[None, List[string]]
Either None or a list of class names in the order of lexicographically sorted unique values in the target (ground truth) array used to train the clf predictor (class ids).
- is_classifier : boolean
If True, the predictive model held under the clf attribute is a classifier. If False, it is a regressor. (Set using the clf attribute via the _is_classifier method.)
- features_number : Union[None, integer]
Either None or the number of features in the clf model. (Extracted from the clf attribute with the _get_features_number method.)
- classes_array : Union[None, numpy.ndarray]
Either None or a 1-dimensional numpy array holding all the possible model predictions (only for classifiers). For regressors this should always be None.
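The class summary states that missing names are inferred with the “feature %d” and “class %s” patterns. A minimal sketch of that inference is given below. The helper names infer_feature_names and infer_class_names are hypothetical, and starting the feature index at 0 is an assumption rather than documented behaviour.

```python
from typing import List, Optional, Sequence, Union


def infer_feature_names(features_number: Optional[int]) -> Optional[List[str]]:
    """Builds default feature names, or returns None if the count is unknown."""
    if features_number is None:
        return None
    # Assumption: feature indices start at 0.
    return ['feature %d' % index for index in range(features_number)]


def infer_class_names(
        classes_array: Optional[Sequence[Union[int, str]]]) -> Optional[List[str]]:
    """Builds default class names from class ids, or returns None for regressors."""
    if classes_array is None:
        return None
    return ['class %s' % str(class_id) for class_id in classes_array]


print(infer_feature_names(3))
print(infer_class_names(['0', 'a', 'aa', 'b']))
```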
- Raises
- TypeError
The clf object is not a scikit-learn model, i.e., it does not inherit from sklearn.base.BaseEstimator. The feature_names parameter is neither a Python list nor None. One of the elements of the feature_names list is not a string. The class_names parameter is neither a Python list nor None. One of the elements of the class_names list is not a string.
- ValueError
Either the feature_names or class_names list is empty. The length of the feature_names list is different from the number of features extracted from the model. The length of the class_names list is different from the length of the classes_array extracted from the model.
- Warns
- UserWarning
The number of features is not available, therefore the length of the feature names list cannot be validated. The classes array is not available, therefore the length of the class names list cannot be validated.
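The checks listed under Raises and Warns can be sketched as a single helper. This is an illustration only: validate_names is a hypothetical function, and the order of the checks and the message wording are assumptions, not the library's implementation.

```python
import warnings
from typing import List, Optional


def validate_names(names: Optional[List[str]],
                   expected_length: Optional[int],
                   kind: str) -> None:
    """Validates a list of feature or class names against the rules above."""
    if names is None:
        return  # Nothing to validate; names would be inferred instead.
    if not isinstance(names, list):
        raise TypeError('The %s names must be a Python list or None.' % kind)
    if not names:
        raise ValueError('The %s names list cannot be empty.' % kind)
    if not all(isinstance(name, str) for name in names):
        raise TypeError('All of the %s names must be strings.' % kind)
    if expected_length is None:
        # The model did not expose the expected length, so only warn.
        warnings.warn(
            'The length of the %s names list cannot be validated.' % kind,
            UserWarning)
    elif len(names) != expected_length:
        raise ValueError(
            'Expected %d %s names, got %d.' % (expected_length, kind, len(names)))


validate_names(['height', 'width'], 2, 'feature')  # passes silently
```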
Methods
explain_instance(): Generates an explanation of a single data point (instance).
explain_model(): Generates a model explanation.
feature_importance(): Computes feature importance.
map_class(clf_class): Maps a class id output by the classifier to a class name.
_get_classes_array() → Optional[numpy.ndarray]
Retrieves the array with classes that the predictive model can output.
For regressors this method must return None. For classifiers it should return a 1-dimensional numpy array that holds all the possible classification results that the model can output if they can be extracted from self.clf, or None otherwise.
- Returns
- classes_array : Union[numpy.ndarray, None]
A 1-dimensional numpy array holding all the possible model predictions (only for classifiers) or None.
- Raises
- NotImplementedError
This error is always raised since the method is an abstract method.
_get_features_number() → Optional[int]
Returns the number of features that the model accepts or None.
If it is possible to extract the number of features (columns) expected by the self.clf predictor, this method should return this number. Otherwise, it must return None.
- Returns
- features_number : Union[integer, None]
The number of features accepted by the model or None.
- Raises
- NotImplementedError
This error is always raised since the method is an abstract method.
_is_classifier() → bool
Indicates whether the clf model is a classifier or a regressor.
This method should return True if the model that this class explains is a classifier and False if it is a regressor.
- Returns
- is_classifier : boolean
True if the self.clf model is a classifier or False when it is a regressor.
- Raises
- NotImplementedError
This error is always raised since the method is an abstract method.
_validate_kind_fitted() → bool
Implements a kind check and a fit check of a predictive model.
This method is called upon initialising the class and checks whether the self.clf predictor is of the right kind. For example, when implementing an explainer for scikit-learn linear models this method should check whether self.clf is a linear model and whether it has been fitted. If either of these conditions is not satisfied this method should raise an appropriate exception: for a wrong model type this should be a ValueError; for an unfitted model this should be scikit-learn’s sklearn.exceptions.NotFittedError (consider using scikit-learn’s sklearn.utils.validation.check_is_fitted function to raise this exception).
- Returns
- is_valid : boolean
True if the kind of the self.clf model is correct and the model is fitted, False otherwise.
- Raises
- NotImplementedError
This error is always raised since the method is an abstract method.
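The contract of the four private methods documented above can be illustrated without installing fatf or scikit-learn. In the sketch below, ExplainerSketch is a hypothetical stand-in for the base class, ToyLinearModel is a hypothetical stand-in for a linear model, and the order in which the hooks are called on initialisation is an assumption. A real subclass would inherit from SKLearnExplainer and raise sklearn.exceptions.NotFittedError for an unfitted model.

```python
from typing import List, Optional


class ToyLinearModel:
    """Hypothetical stand-in for a linear model (not a scikit-learn class)."""

    def __init__(self, coef: Optional[List[float]] = None,
                 classes: Optional[List[str]] = None) -> None:
        self.coef_ = coef        # one weight per feature; None until "fitted"
        self.classes_ = classes  # sorted class ids; None for a regressor


class ExplainerSketch:
    """Hypothetical stand-in for the base class; it calls the four hooks."""

    def __init__(self, clf: object) -> None:
        self.clf = clf
        self._validate_kind_fitted()
        self.is_classifier = self._is_classifier()
        self.features_number = self._get_features_number()
        self.classes_array = self._get_classes_array()

    def _validate_kind_fitted(self) -> bool:
        raise NotImplementedError('Override in a subclass.')

    def _is_classifier(self) -> bool:
        raise NotImplementedError('Override in a subclass.')

    def _get_features_number(self) -> Optional[int]:
        raise NotImplementedError('Override in a subclass.')

    def _get_classes_array(self) -> Optional[List[str]]:
        raise NotImplementedError('Override in a subclass.')


class ToyLinearExplainer(ExplainerSketch):
    """Overrides all four hooks for the toy linear model."""

    def _validate_kind_fitted(self) -> bool:
        if not isinstance(self.clf, ToyLinearModel):
            raise ValueError('The model is not a ToyLinearModel.')
        if self.clf.coef_ is None:
            # A real explainer would raise sklearn.exceptions.NotFittedError.
            raise RuntimeError('The model has not been fitted.')
        return True

    def _is_classifier(self) -> bool:
        return self.clf.classes_ is not None

    def _get_features_number(self) -> Optional[int]:
        return len(self.clf.coef_)

    def _get_classes_array(self) -> Optional[List[str]]:
        return list(self.clf.classes_) if self._is_classifier() else None


explainer = ToyLinearExplainer(
    ToyLinearModel(coef=[0.5, -1.0, 2.0], classes=['a', 'b']))
print(explainer.is_classifier, explainer.features_number, explainer.classes_array)
```

Instantiating ExplainerSketch directly fails with NotImplementedError, which mirrors the abstract behaviour described for each of the four methods.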
explain_instance() → numpy.ndarray
Generates an explanation of a single data point (instance).
This can be an explanation of a data point from a data set or of a prediction provided by a predictive model.
map_class(clf_class: Union[int, str]) → str
Maps a class id output by the classifier to a class name.
A mapping will only be provided if the class was initialised with class names or an array of possible predictions was extracted from the classifier.
- Parameters
- clf_class : Union[integer, string]
A class id output by the classifier.
- Returns
- mapped_class : string
A class name corresponding to the class id.
- Raises
- RuntimeError
The error is raised when trying to map a class for a regressor. It is also raised if the class was not sufficiently initialised, i.e., either the classes_array or the class_names attribute is missing.
- TypeError
The clf_class parameter is neither an integer nor a string.
- ValueError
The given clf_class is not one of the values that the classifier can output.
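The mapping and its error conditions can be sketched as a standalone function. This is not the library's implementation: map_class_sketch is a hypothetical helper that takes the relevant attributes as arguments, and the order of the checks is an assumption based on the Raises list above.

```python
from typing import List, Optional, Sequence, Union


def map_class_sketch(clf_class: Union[int, str],
                     is_classifier: bool,
                     classes_array: Optional[Sequence[Union[int, str]]],
                     class_names: Optional[List[str]]) -> str:
    """Maps a class id to a class name using position in classes_array."""
    if not is_classifier:
        raise RuntimeError('Class mapping is not available for regressors.')
    if classes_array is None or class_names is None:
        raise RuntimeError('Both classes_array and class_names are required.')
    if not isinstance(clf_class, (int, str)):
        raise TypeError('The class id must be an integer or a string.')
    class_ids = list(classes_array)
    if clf_class not in class_ids:
        raise ValueError('The class id is not one the classifier can output.')
    # class_names follows the same ordering as classes_array.
    return class_names[class_ids.index(clf_class)]


ids = ['0', 'a', 'aa', 'b']
names = ['zero', 'letter a', 'double a', 'letter b']  # hypothetical names
print(map_class_sketch('aa', True, ids, names))
```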