.. note::
    :class: sphx-glr-download-link-note

    Click :ref:`here <sphx_glr_download_sphinx_gallery_auto_accountability_xmpl_accountability_data_measure.py>` to download the full example code or run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_sphinx_gallery_auto_accountability_xmpl_accountability_data_measure.py:


===================================================
Measuring Robustness of a Data Set -- Sampling Bias
===================================================

This example illustrates how to identify Sampling Bias in a data set by
comparing the sizes of groups defined on a selected feature.


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    The counts for groups defined on "petal length (cm)" feature (index 2) are:
        * For the population split *x <= 2.5* there are: 50 data points.
        * For the population split *2.5 < x <= 4.75* there are: 45 data points.
        * For the population split *4.75 < x* there are: 55 data points.

    The Sampling Bias for *petal length (cm)* feature (index 2) grouping is:
        * For "x <= 2.5" and "2.5 < x <= 4.75" groupings there >is no< Sampling Bias.
        * For "x <= 2.5" and "4.75 < x" groupings there >is no< Sampling Bias.
        * For "2.5 < x <= 4.75" and "4.75 < x" groupings there >is< Sampling Bias.
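
The counts printed above make the pairwise result easy to check by hand. The
snippet below is a minimal sketch of one possible pairwise rule, not the FAT
Forensics implementation. It assumes that a pair of groups is flagged when the
larger count exceeds the smaller one by more than 20%. The function name
``pairwise_imbalance`` and the ``tolerance`` parameter are hypothetical and
used for illustration only. This assumed rule happens to reproduce the flags
shown above, but the criterion the library actually applies may differ.

.. code-block:: python

    from itertools import combinations


    def pairwise_imbalance(counts, tolerance=0.2):
        """Flag pairs whose larger count exceeds the smaller by > tolerance.

        Hypothetical rule for illustration; not the library's criterion.
        """
        flags = {}
        for (i, count_i), (j, count_j) in combinations(enumerate(counts), 2):
            smaller, larger = sorted((count_i, count_j))
            if smaller == 0:
                # An empty group paired with a non-empty one is imbalanced.
                flags[(i, j)] = larger > 0
            else:
                flags[(i, j)] = (larger / smaller - 1) > tolerance
        return flags


    # Counts taken from the output above.
    flags = pairwise_imbalance([50, 45, 55])
    for pair, flagged in flags.items():
        print(pair, flagged)

Under this assumption only the pair of groups with 45 and 55 data points is
flagged, since 55 / 45 is roughly 1.22.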


|


.. code-block:: default

    # Author: Kacper Sokol <k.sokol@bristol.ac.uk>
    # License: new BSD

    import fatf.utils.data.datasets as fatf_datasets

    import fatf.accountability.data.measures as fatf_dam

    print(__doc__)

    # Load data
    iris_data_dict = fatf_datasets.load_iris()
    iris_X = iris_data_dict['data']
    iris_y = iris_data_dict['target'].astype(int)
    iris_feature_names = iris_data_dict['feature_names']
    iris_class_names = iris_data_dict['target_names']

    # Select a feature for which the Sampling Bias will be measured
    selected_feature_index = 2
    selected_feature_name = iris_feature_names[selected_feature_index]

    # Define grouping on the selected feature
    selected_feature_grouping = [2.5, 4.75]

    # Get counts, weights and names of the specified grouping
    grp_counts, grp_weights, grp_names = fatf_dam.sampling_bias(
        iris_X, selected_feature_index, selected_feature_grouping)

    # Print out counts per group
    print('The counts for groups defined on "{}" feature (index {}) are:'
          ''.format(selected_feature_name, selected_feature_index))
    for g_name, g_count in zip(grp_names, grp_counts):
        is_are = 'is' if g_count == 1 else 'are'
        print('    * For the population split *{}* there {}: '
              '{} data points.'.format(g_name, is_are, g_count))

    # Get the disparity grid
    bias_grid = fatf_dam.sampling_bias_grid_check(grp_counts)

    # Print out the disparity for every pair of groups
    print('\nThe Sampling Bias for *{}* feature (index {}) grouping is:'
          ''.format(selected_feature_name, selected_feature_index))
    for grouping_i, grouping_name_i in enumerate(grp_names):
        j_offset = grouping_i + 1
        for grouping_j, grouping_name_j in enumerate(grp_names[j_offset:]):
            grouping_j += j_offset
            is_not = '' if bias_grid[grouping_i, grouping_j] else ' no'

            print('    * For "{}" and "{}" groupings there >is{}< Sampling Bias.'
                  ''.format(grouping_name_i, grouping_name_j, is_not))
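
The group names in the output (``x <= 2.5``, ``2.5 < x <= 4.75`` and
``4.75 < x``) suggest that each boundary value falls into the lower group. As a
rough illustration of that binning, the following standard-library sketch
counts values per group. ``count_groups`` is a hypothetical helper, not part of
FAT Forensics, and the sample values are made up rather than taken from the
Iris data.

.. code-block:: python

    from bisect import bisect_left


    def count_groups(values, boundaries):
        """Count values per interval, with each boundary in the lower group."""
        boundaries = sorted(boundaries)
        counts = [0] * (len(boundaries) + 1)
        for value in values:
            # bisect_left gives the number of boundaries strictly below
            # value, i.e. the index of the interval (lower, upper] it is in.
            counts[bisect_left(boundaries, value)] += 1
        return counts


    # Made-up values, not the Iris data.
    sample = [1.4, 2.5, 3.0, 4.75, 4.8, 6.1]
    print(count_groups(sample, [2.5, 4.75]))  # [2, 2, 2]

Using ``bisect_left`` rather than ``bisect_right`` is what places a value equal
to a boundary, such as 2.5, in the group below it.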


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.112 seconds)


.. _sphx_glr_download_sphinx_gallery_auto_accountability_xmpl_accountability_data_measure.py:


.. only:: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: binder-badge

    .. image:: https://mybinder.org/badge_logo.svg
      :target: https://mybinder.org/v2/gh/fat-forensics/fat-forensics-doc/master?filepath=notebooks/sphinx_gallery_auto/accountability/xmpl_accountability_data_measure.ipynb
      :width: 150 px


  .. container:: sphx-glr-download

     :download:`Download Python source code: xmpl_accountability_data_measure.py <xmpl_accountability_data_measure.py>`


  .. container:: sphx-glr-download

     :download:`Download Jupyter notebook: xmpl_accountability_data_measure.ipynb <xmpl_accountability_data_measure.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.readthedocs.io>`_