fatf.fairness.data.measures.systemic_bias(dataset: numpy.ndarray, ground_truth: numpy.ndarray, protected_features: List[Union[str, int]]) → numpy.ndarray

Checks for systemic bias in a dataset.

This function looks for pairs of data points that share the same values of the unprotected features but differ in the protected features. For each such pair the labels (ground truth) are compared; if they differ, the pair is marked as biased. The result is a square boolean numpy array in which True indicates that systemic bias exists for the corresponding pair of data points.
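The pairwise check described above can be sketched in plain numpy. This is an illustrative re-implementation under stated assumptions, not the library's code: the function name `systemic_bias_sketch` is hypothetical, the dataset is assumed to be a plain (non-structured) 2-dimensional array with integer column indices, and no input validation is performed.

```python
import numpy as np


def systemic_bias_sketch(dataset, ground_truth, protected_features):
    """Hypothetical sketch of the documented check (not the fatf code)."""
    n_cols = dataset.shape[1]
    unprotected_features = [
        i for i in range(n_cols) if i not in protected_features
    ]

    unprotected = dataset[:, unprotected_features]
    protected = dataset[:, protected_features]

    # Entry (i, j) is True when rows i and j agree on every unprotected column.
    same_unprotected = (
        unprotected[:, None, :] == unprotected[None, :, :]
    ).all(axis=2)
    # Entry (i, j) is True when rows i and j differ on some protected column.
    differ_protected = (
        protected[:, None, :] != protected[None, :, :]
    ).any(axis=2)
    # Entry (i, j) is True when the labels of rows i and j differ.
    differ_label = ground_truth[:, None] != ground_truth[None, :]

    return same_unprotected & differ_protected & differ_label


# Toy data: column 0 is treated as the protected attribute.
dataset = np.array([
    [0, 1, 5],
    [1, 1, 5],
    [0, 2, 5],
    [1, 1, 5],
])
ground_truth = np.array([0, 1, 0, 1])

bias_matrix = systemic_bias_sketch(dataset, ground_truth, [0])
print(bias_matrix)
```

In this toy data, row 0 matches rows 1 and 3 on the unprotected columns while differing from them in the protected column and the label, so only those pairs are flagged. The comparison builds intermediate arrays with one entry per pair of rows and per column, so the sketch is only suitable for small datasets.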


Parameters

dataset : numpy.ndarray

A dataset to be evaluated for systemic bias.


ground_truth : numpy.ndarray

The labels corresponding to the dataset.

protected_features : List[column index]

A list of column indices in the dataset that hold protected attributes.


Returns

A square, symmetric boolean numpy array that indicates which pairs of data points share the same unprotected features but differ in both their protected features and their ground truth annotation.
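One way to read such a matrix is to list the flagged pairs once each by looking only above the diagonal. The snippet below is a minimal sketch; the `bias_matrix` values are made up for illustration and stand in for the array returned by the function.

```python
import numpy as np

# Hypothetical result for four data points (symmetric, False on the diagonal).
bias_matrix = np.array([
    [False, True, False, True],
    [True, False, False, False],
    [False, False, False, False],
    [True, False, False, False],
])

# Keep the strict upper triangle so that each pair is reported only once.
rows, cols = np.nonzero(np.triu(bias_matrix, k=1))
biased_pairs = list(zip(rows.tolist(), cols.tolist()))

print(bias_matrix.any())  # whether any pair at all is flagged
print(biased_pairs)       # index pairs of the flagged data points
```

Because the matrix is symmetric, the lower triangle carries no extra information and can be ignored when counting or reporting pairs.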


Raises

The dataset is not a 2-dimensional numpy array, the ground truth is not a 1-dimensional numpy array or the number of rows in the dataset is not equal to the number of elements in the ground truth array.


Some of the column indices given in the protected_features list are not valid for the input dataset.


The protected_features parameter is not a list.


There are duplicate values in the protected feature indices list.
