Computes information needed for evaluating and remedying sampling bias.

Computes the number of instances in each sub-population from the row indices assigned to it, together with per-instance weights that can be used for cost-sensitive learning to mitigate the sampling bias.

This is an alternative to the fatf.accountability.data.measures.sampling_bias function for use when the desired instance binning is already available.

For the warnings and errors raised by this function, please see the documentation of the fatf.utils.data.tools.validate_indices_per_bin function.


A list of lists, where each inner list holds the row indices of one group (sub-population).


The number of data points in each sub-population defined by the partitioning of the selected feature.


A weight for every instance (that could be grouped, i.e. assigned to one of the sub-populations) in the input dataset. The weights are useful for training a cost-sensitive classifier to mitigate the sampling bias. Each weight is inversely proportional to the size of the sub-population that the instance belongs to.
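
The relationship between the binned indices, the counts and the weights can be illustrated with a minimal sketch. This is not the library's implementation: the helper name sketch_counts_and_weights is hypothetical, the weight is taken to be exactly 1 / (sub-population size) with no further normalisation, indices are assumed to be non-negative with each index in at most one bin, and any row that appears in no bin is given a weight of 0. The actual function may scale the weights differently and validates its input first.

```python
from typing import List, Tuple


def sketch_counts_and_weights(
        indices_per_bin: List[List[int]]) -> Tuple[List[int], List[float]]:
    """Hypothetical helper, not part of fatf: counts and inverse-size weights."""
    # One count per sub-population: the number of row indices in its bin.
    counts = [len(bin_indices) for bin_indices in indices_per_bin]

    # Assumption: the dataset has as many rows as the largest index plus one.
    n_rows = 1 + max(
        (index for bin_indices in indices_per_bin for index in bin_indices),
        default=-1)

    # Assumption: rows that are not assigned to any bin keep a weight of 0.
    weights = [0.0] * n_rows
    for bin_indices, count in zip(indices_per_bin, counts):
        for index in bin_indices:
            # Every instance in a bin shares the weight 1 / (size of the bin).
            weights[index] = 1.0 / count
    return counts, weights


# Rows 0, 1, 2 and 5 form one sub-population; rows 3 and 4 form another.
counts, weights = sketch_counts_and_weights([[0, 1, 2, 5], [3, 4]])
```

In this example the smaller group receives the larger per-instance weight, so both groups contribute equally in total when the weights are passed to a learner that accepts per-sample weights.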