fatf.accountability.data.measures.sampling_bias_indexed

fatf.accountability.data.measures.sampling_bias_indexed(indices_per_bin: List[List[int]]) → Tuple[List[int], numpy.ndarray]
Computes information needed for evaluating and remedying sampling bias.
Computes the number of instances per subpopulation from the row indices assigned to each subpopulation, together with instance weights that can be used for cost-sensitive learning to mitigate the sampling bias.
This is an alternative to the fatf.accountability.data.measures.sampling_bias function, which can be used when one already has the desired instance binning.
For warnings and errors raised by this function, please see the documentation of the fatf.utils.data.tools.validate_indices_per_bin function.
 Parameters
 indices_per_bin : List[List[integer]]
A list of lists, where each inner list holds the row indices of one particular group (subpopulation).
 Returns
 counts : List[integer]
The number of data points in each subpopulation defined by indices_per_bin.
 weights : numpy.ndarray
A weight for every instance (that could be grouped, i.e. assigned to one of the subpopulations) in the input dataset. The weights are useful for training a cost-sensitive classifier to mitigate the sampling bias. The weight of an instance is inversely proportional to the number of instances in its subpopulation.
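
 Examples
The relation between the binned indices, the counts and the weights can be illustrated with a small, dependency-free sketch. It does not call fatf; sampling_bias_indexed_sketch is a hypothetical stand-in written for this illustration. It assumes that every row index from 0 to the largest one appears in exactly one bin, that an instance's weight is exactly 1 divided by the size of its bin (the library may scale or normalise weights differently), and it returns a plain list of weights rather than a numpy.ndarray.

```python
from typing import List, Tuple


def sampling_bias_indexed_sketch(
        indices_per_bin: List[List[int]]) -> Tuple[List[int], List[float]]:
    """Hypothetical illustration; not the fatf implementation."""
    # One count per subpopulation: the number of row indices in its bin.
    counts = [len(indices) for indices in indices_per_bin]

    # Assumption: every row index 0..n-1 appears in exactly one bin,
    # so the total number of rows equals the sum of the bin sizes.
    n_rows = sum(counts)
    weights = [0.0] * n_rows

    # Assumption: weight = 1 / (size of the instance's subpopulation),
    # so instances from smaller groups receive larger weights.
    for indices in indices_per_bin:
        for row in indices:
            weights[row] = 1.0 / len(indices)
    return counts, weights


# Rows 0, 2, 3 and 5 form one group; rows 1 and 4 form another.
bins = [[0, 2, 3, 5], [1, 4]]
counts, weights = sampling_bias_indexed_sketch(bins)
print(counts)   # [4, 2]
print(weights)  # [0.25, 0.5, 0.25, 0.25, 0.5, 0.25]
```

Under these assumptions the weights of each subpopulation sum to 1, so an under-represented group contributes as much in total as a larger one when the weights are passed to a learner that accepts per-instance weights.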