fatf.fairness.data.measures.systemic_bias
fatf.fairness.data.measures.systemic_bias(dataset: numpy.ndarray, ground_truth: numpy.ndarray, protected_features: List[Union[str, int]]) → numpy.ndarray
Checks for systemic bias in a dataset.
This function checks whether there exist data points that share the same unprotected features but differ in protected features. For every such pair the labels (ground truth) are compared; if they differ, the pair is marked as biased. The result is a square boolean numpy array in which True at position (i, j) indicates systemic bias between data points i and j.
- Parameters
- dataset : numpy.ndarray
A dataset to be evaluated for systemic bias.
- ground_truth : numpy.ndarray
The labels corresponding to the dataset.
- protected_features : List[column index]
A list of column indices in the dataset that hold protected attributes.
- Returns
- systemic_bias_matrix : numpy.ndarray
A square, symmetric, boolean numpy array that indicates which pairs of data points share the same unprotected features but differ in protected features and the ground truth annotation.
- Raises
- IncorrectShapeError
The dataset is not a 2-dimensional numpy array, the ground truth is not a 1-dimensional numpy array or the number of rows in the dataset is not equal to the number of elements in the ground truth array.
- IndexError
Some of the column indices given in the protected_features list are not valid for the input dataset.
- TypeError
The protected_features parameter is not a list.
- ValueError
There are duplicate values in the protected feature indices list.
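
The pairwise check described above can be illustrated with a minimal NumPy-only sketch. This is not the library's implementation: the function name systemic_bias_sketch and the toy data are hypothetical, the logic follows the prose description on this page only, it handles integer column indices of a plain 2-dimensional array (not string indices), and it performs none of the input validation listed under Raises.

```python
import numpy as np


def systemic_bias_sketch(dataset, ground_truth, protected_features):
    """Hypothetical re-creation of the documented behaviour (not fatf code)."""
    dataset = np.asarray(dataset)
    ground_truth = np.asarray(ground_truth)

    # Split the columns into protected and unprotected ones.
    protected = list(protected_features)
    unprotected = [i for i in range(dataset.shape[1]) if i not in protected]
    prot = dataset[:, protected]
    unprot = dataset[:, unprotected]

    # Compare every row with every other row through broadcasting:
    # (n, 1, k) against (1, n, k) yields an (n, n, k) comparison.
    same_unprotected = (unprot[:, None, :] == unprot[None, :, :]).all(axis=2)
    different_protected = (prot[:, None, :] != prot[None, :, :]).any(axis=2)
    different_label = ground_truth[:, None] != ground_truth[None, :]

    # A pair is flagged only when all three conditions hold.
    return same_unprotected & different_protected & different_label


# Toy data: column 0 is treated as the protected feature.
data = np.array([
    [0, 1, 5],
    [1, 1, 5],
    [0, 2, 5],
    [1, 1, 5],
])
labels = np.array([0, 1, 0, 1])

matrix = systemic_bias_sketch(data, labels, protected_features=[0])

# Each biased pair once (upper triangle only).
print(np.argwhere(np.triu(matrix)).tolist())  # [[0, 1], [0, 3]]
```

Rows 1 and 3 are identical in every column, so they are not flagged against each other, while row 0 matches both of them on the unprotected columns yet differs in the protected column and in the label. The resulting matrix is symmetric with a False diagonal, in line with the Returns description.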