AI Fairness: A Brief Introduction to AI Fairness 360
Fairness has become one of the most popular topics in machine learning in recent years; several new papers on fairness appear on arXiv every week. But the first question to ask is: why is fairness so important? We live in an age where we depend on machine learning to automate many of our tasks. Amazon uses recommender systems to suggest different items to different groups of people, Netflix customizes its pages for each user, chatbots answer our questions, cars drive themselves, employers use ML systems to screen candidates, and courts in the United States use the COMPAS algorithm for recidivism prediction. Machine learning systems have become an inseparable part of our lives and will only become more widely used in the near future.
AI is useful, but it can also be wrong. Any machine learning system is only as good as the data it is trained on. Machine learning discovers and generalizes patterns in the data and can therefore replicate bias. In 2016, ProPublica found that COMPAS, the algorithm used for recidivism prediction, produces a much higher false positive rate for Black defendants than for white defendants (see Fig. 1; Larson et al., ProPublica, 2016 [1]).
A few more examples: Amazon’s hiring algorithm [2] that favored men, Facebook’s charge of housing discrimination in targeted ads [3], and a prominent healthcare algorithm [4] that exhibited significant racial bias.
When such models are deployed at scale, they can produce a large number of biased decisions and harm a large number of users.
Sources of bias
Algorithms by themselves do not have any intrinsic prejudices but can learn to exhibit discriminatory behaviors when presented with inappropriate data. Below are a few data quality issues that contribute to bias:
- Insufficient data: The dataset contains too few samples overall, or too few samples for certain minority groups. Insights from models trained on such data are not dependable, and caution should be exercised in use cases that directly impact individuals.
- Data collection: Bias is introduced by the technologies, or the humans, used to collect the data, e.g. a collection tool that is only available in a specific language. It can also be a consequence of the sampling strategy, e.g. insufficient representation of a sub-population.
- Historical bias: A significant difference in the target distribution for different groups can be due to underlying human prejudices in the data. A few of the well-known examples are discrimination in hiring, loan discrimination practices or bias in judicial sentencing.
For bias detection and mitigation, a few open-source libraries are available, such as IBM’s AI Fairness 360, Audit-AI (developed by the team at Pymetrics), Fairlearn, FairML, and Google’s What-If Tool.
AI Fairness 360
The AI Fairness 360 [5] toolkit is an extensible open-source library containing techniques developed by the research community to help detect and mitigate bias in machine learning models throughout the AI application lifecycle. The package is available in both Python and R.
The AI Fairness 360 package includes:
- a comprehensive set of metrics for datasets and models to test for biases,
- explanations for these metrics, and
- algorithms to mitigate bias in datasets and models.

The toolkit is designed to translate algorithmic research from the lab into the actual practice of domains as wide-ranging as finance, human capital management, healthcare, and education. We invite you to use it and improve it.
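To give a feel for how the package is organized, here is a short orientation sketch (assuming the library is installed via pip install aif360; some algorithms need optional extras such as TensorFlow):

```python
# Orientation sketch: the toolkit's main pieces map directly to submodules.
from aif360.datasets import BinaryLabelDataset            # dataset wrappers
from aif360.metrics import BinaryLabelDatasetMetric       # fairness metrics
from aif360.explainers import MetricTextExplainer         # plain-text explanations of metrics
from aif360.algorithms.preprocessing import Reweighing    # one of the bias mitigation algorithms
```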
Definitions
- Fairness metric: A quantification of unwanted bias in training data or models.
- Favorable label: A label whose value corresponds to an outcome that provides an advantage to the recipient. The opposite is an unfavorable label.
- Group fairness: The goal of groups defined by protected attributes receiving similar treatments or outcomes.
- Individual fairness: The goal of similar individuals receiving similar treatments or outcomes.
- In-processing algorithm: A bias mitigation algorithm that is applied to a model during its training.
- Post-processing algorithm: A bias mitigation algorithm that is applied to predicted labels.
- Pre-processing algorithm: A bias mitigation algorithm that is applied to training data.
- Privileged protected attribute: A value of a protected attribute indicating a group that has historically been at systematic advantage.
- Protected attribute: An attribute that partitions a population into groups whose outcomes should have parity. Examples include race, gender, caste, and religion. Protected attributes are not universal, but are application specific.
Metrics used based on application
- If the application is concerned with individual fairness, the metrics in the SampleDistortionMetric class should be used. If the application is concerned with group fairness, the metrics in the DatasetMetric class (and its child classes, such as BinaryLabelDatasetMetric) as well as the ClassificationMetric class (except the ones noted in the next sentence) should be used. If the application is concerned with both individual and group fairness, and requires the use of a single metric, the generalized entropy index and its specializations, the Theil index and the coefficient of variation, in the ClassificationMetric class should be used.
- There are a large number of fairness metrics that may be appropriate for a given application. Fairness can be measured at different points in a machine learning pipeline: either on the training data or on the learned model, which also relates to the pre-processing, in-processing, and post-processing categories of bias mitigation algorithms [6]. If the application requires metrics on training data, the ones in the DatasetMetric class (and its child classes, such as BinaryLabelDatasetMetric) should be used. If the application requires metrics on models, the ones in the ClassificationMetric class should be used; a short sketch of both follows this list.
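As a rough sketch of the distinction (using the Adult dataset bundled with AIF360 and a scikit-learn logistic regression as a stand-in model; the exact preprocessing here is an assumption, not a prescription):

```python
# Minimal sketch: dataset-level vs. model-level fairness metrics in AIF360.
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions import (
    load_preproc_data_adult)
from aif360.metrics import BinaryLabelDatasetMetric, ClassificationMetric

privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]

dataset = load_preproc_data_adult(['sex'])
train, test = dataset.split([0.7], shuffle=True)

# Metrics on the training data (DatasetMetric family).
train_metric = BinaryLabelDatasetMetric(train,
                                        unprivileged_groups=unprivileged_groups,
                                        privileged_groups=privileged_groups)
print('Disparate impact (training data):', train_metric.disparate_impact())

# Metrics on a learned model (ClassificationMetric): compares true vs. predicted labels.
scaler = StandardScaler().fit(train.features)
clf = LogisticRegression(max_iter=1000).fit(scaler.transform(train.features),
                                            train.labels.ravel())
test_pred = test.copy(deepcopy=True)
test_pred.labels = clf.predict(scaler.transform(test.features)).reshape(-1, 1)

clf_metric = ClassificationMetric(test, test_pred,
                                  unprivileged_groups=unprivileged_groups,
                                  privileged_groups=privileged_groups)
print('Equal opportunity difference:', clf_metric.equal_opportunity_difference())
print('Theil index:', clf_metric.theil_index())  # covers individual + group fairness
```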
A few examples of metrics
- Statistical parity difference: The difference in the rate of favorable outcomes between the unprivileged group and the privileged group. The ideal value is 0, which means no bias is present. A negative value means the data is biased in favor of the privileged group; a positive value means it is biased in favor of the unprivileged group.
- Disparate impact: The ratio of the rate of favorable outcomes for the unprivileged group to that of the privileged group. The ideal value is 1.
- Equal opportunity difference: The difference in true positive rates between the unprivileged group and the privileged group. The ideal value (no bias) is 0.
- Average odds difference: The average of the differences in false positive rate and true positive rate between the unprivileged group and the privileged group. The ideal value is 0.
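Written out explicitly (with $\hat{Y}=1$ denoting the favorable outcome, $D$ the protected attribute, and TPR/FPR the true and false positive rates), the four metrics above are:

```latex
\begin{align*}
\text{Statistical parity difference} &= P(\hat{Y}=1 \mid D=\text{unprivileged}) - P(\hat{Y}=1 \mid D=\text{privileged}) \\
\text{Disparate impact} &= \frac{P(\hat{Y}=1 \mid D=\text{unprivileged})}{P(\hat{Y}=1 \mid D=\text{privileged})} \\
\text{Equal opportunity difference} &= \mathrm{TPR}_{\text{unprivileged}} - \mathrm{TPR}_{\text{privileged}} \\
\text{Average odds difference} &= \tfrac{1}{2}\big[(\mathrm{FPR}_{\text{unprivileged}} - \mathrm{FPR}_{\text{privileged}}) + (\mathrm{TPR}_{\text{unprivileged}} - \mathrm{TPR}_{\text{privileged}})\big]
\end{align*}
```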
Algorithms
Bias mitigation algorithms attempt to improve the fairness metrics by modifying the training data, the learning algorithm, or the predictions. These algorithm categories are known as pre-processing, in-processing, and post-processing, respectively [6]. The choice among algorithm categories can partially be made based on the user persona’s ability to intervene at different parts of a machine learning pipeline. If the user is allowed to modify the training data, then pre-processing can be used. If the user is allowed to change the learning algorithm, then in-processing can be used. If the user can only treat the learned model as a black box without any ability to modify the training data or learning algorithm, then only post-processing can be used.
- Among pre-processing algorithms, reweighing[7] only changes weights applied to training samples; it does not change any feature or label values. Therefore, it may be a preferred option in case the application does not allow for value changes.
- Among in-processing algorithms, the prejudice remover is limited to learning algorithms that allow for regularization terms whereas the adversarial debiasing algorithm[8] allows for a more general set of learning algorithms, and may be preferred for that reason.
- Among post-processing algorithms, the two equalized odds post-processing algorithms have a randomized component whereas the reject option algorithm[9] is deterministic, and may be preferred for that reason.
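In AIF360, these three categories correspond to three submodules, and the algorithms mentioned above can be imported as follows (class names as in the current library; each class has its own constructor arguments):

```python
# The three mitigation categories map to three AIF360 submodules.
from aif360.algorithms.preprocessing import Reweighing
from aif360.algorithms.inprocessing import PrejudiceRemover, AdversarialDebiasing
from aif360.algorithms.postprocessing import (EqOddsPostprocessing,
                                              CalibratedEqOddsPostprocessing,
                                              RejectOptionClassification)
```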
Reweighing Algorithm
The advantage of this approach [11] is that, instead of modifying the labels, it assigns different weights to the training examples based on their combination of protected-attribute group and outcome, so that bias is removed from the training dataset. The weights are based on frequency counts. However, because this technique only works with classifiers that can handle row-level (instance) weights, it may limit your modeling options.
To demonstrate how this technique can be used to reduce bias, I used the Adult dataset [10]. The binary target in this dataset is whether an individual has an income higher or lower than $50k. It contains several features that are protected by the law in the US, but for simplicity in this post, I will focus on sex. As can be seen in the table below, Male is the privileged group with a 31% probability of having a positive outcome (>$50k) compared to an 11% probability of having a positive outcome for the Female group.
The disparate impact metric, as defined in the metrics section above, is a measure of discrimination in the data. A score of 1 indicates the dataset is discrimination-free. When calculated on the unweighted Adult dataset for Male versus Female, the score is 0.36.
Using the frequency counts in Table 1 above, the reweighing technique assigns each combination of protected-attribute group and outcome a weight equal to its expected probability (computed as if group and outcome were independent) divided by its observed probability. For example, for the privileged group with the positive outcome (that is, Male with greater than $50k income), the weight is calculated as P(Male) × P(>$50k) / P(Male, >$50k). Applying the same calculation to all four group-outcome categories gives the weights for the training data.
By applying these weights to the counts, the disparate impact metric becomes 1 on the training data, which is now “discrimination-free.”
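Here is a hedged sketch of the same experiment with AIF360 (using the packaged Adult loader; the exact disparate impact value depends on the preprocessing and split, so treat the numbers as illustrative):

```python
# Minimal sketch: reweighing the Adult training data and checking disparate impact.
from aif360.algorithms.preprocessing import Reweighing
from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions import (
    load_preproc_data_adult)
from aif360.metrics import BinaryLabelDatasetMetric

privileged_groups = [{'sex': 1}]    # Male
unprivileged_groups = [{'sex': 0}]  # Female

train, _ = load_preproc_data_adult(['sex']).split([0.7], shuffle=True)

before = BinaryLabelDatasetMetric(train,
                                  unprivileged_groups=unprivileged_groups,
                                  privileged_groups=privileged_groups)
print('Disparate impact before reweighing:', before.disparate_impact())

# Reweighing only changes instance_weights; features and labels are untouched.
rw = Reweighing(unprivileged_groups=unprivileged_groups,
                privileged_groups=privileged_groups)
train_transf = rw.fit_transform(train)

after = BinaryLabelDatasetMetric(train_transf,
                                 unprivileged_groups=unprivileged_groups,
                                 privileged_groups=privileged_groups)
print('Disparate impact after reweighing:', after.disparate_impact())  # ~1.0
```

Because only the instance weights change, the reweighted dataset can then be fed to any classifier that accepts per-row sample weights, for example through scikit-learn’s sample_weight argument to fit.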
Adversarial debiasing Algorithm
In adversarial debiasing [12], you build two models. The first predicts your target, based on whatever feature engineering and pre-processing steps you have already applied to your training data. The second model is the adversary: it tries to predict the sensitive attribute from the predictions of the first model. Ideally, in a situation without bias, this adversary should not be able to predict the sensitive attribute well. The adversary therefore guides modifications of the original model (via its parameters and weighting) that weaken the adversary’s predictive power, until it can no longer predict the protected attributes well from the outcomes.
The advantage of this method is that you directly intervene at the learning stage of the modeling workflow. In addition, it can be applied to both classification and regression.
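A minimal sketch of AIF360’s AdversarialDebiasing class follows (it uses the TensorFlow 1.x-style session API; the data loading and group definitions mirror the earlier examples and are assumptions for illustration):

```python
# Minimal sketch: in-processing with AdversarialDebiasing (requires TensorFlow).
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

from aif360.algorithms.inprocessing import AdversarialDebiasing
from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions import (
    load_preproc_data_adult)
from aif360.metrics import ClassificationMetric

privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]
train, test = load_preproc_data_adult(['sex']).split([0.7], shuffle=True)

sess = tf.Session()
model = AdversarialDebiasing(privileged_groups=privileged_groups,
                             unprivileged_groups=unprivileged_groups,
                             scope_name='debiased_classifier',
                             debias=True,   # set False to train a plain classifier for comparison
                             num_epochs=50,
                             sess=sess)
model.fit(train)                # trains the classifier and the adversary jointly
test_pred = model.predict(test)

metric = ClassificationMetric(test, test_pred,
                              unprivileged_groups=unprivileged_groups,
                              privileged_groups=privileged_groups)
print('Average odds difference:', metric.average_odds_difference())
sess.close()
```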
Reject option algorithm
In this algorithm [13], an optimal classification threshold and a critical-region boundary (the Reject Option Classification margin) are estimated on a validation set, subject to the desired constraint on fairness. The best parameters are those that give the best classification performance while satisfying the fairness constraints. The constraint can be placed on any of the following fairness measures (a code sketch follows the list):
- Statistical parity difference on the predictions of the classifier
- Average odds difference for the classifier
- Equal opportunity difference for the classifier.
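As a sketch of how this post-processing step might be wired up (a scikit-learn logistic regression stands in for the base classifier; the statistical parity bounds used here are the library defaults and are assumptions for illustration):

```python
# Minimal sketch: reject option classification as a post-processing step.
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

from aif360.algorithms.postprocessing import RejectOptionClassification
from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions import (
    load_preproc_data_adult)
from aif360.metrics import ClassificationMetric

privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]

dataset = load_preproc_data_adult(['sex'])
train, valid, test = dataset.split([0.5, 0.8], shuffle=True)

# Train an ordinary classifier and store its scores on copies of the datasets.
scaler = StandardScaler().fit(train.features)
clf = LogisticRegression(max_iter=1000).fit(scaler.transform(train.features),
                                            train.labels.ravel())
valid_pred, test_pred = valid.copy(deepcopy=True), test.copy(deepcopy=True)
valid_pred.scores = clf.predict_proba(scaler.transform(valid.features))[:, 1].reshape(-1, 1)
test_pred.scores = clf.predict_proba(scaler.transform(test.features))[:, 1].reshape(-1, 1)

# Pick the threshold and ROC margin on the validation split under a
# statistical parity constraint, then relabel the test predictions.
roc = RejectOptionClassification(unprivileged_groups=unprivileged_groups,
                                 privileged_groups=privileged_groups,
                                 metric_name='Statistical parity difference',
                                 metric_ub=0.05, metric_lb=-0.05)
roc = roc.fit(valid, valid_pred)
test_pred_transf = roc.predict(test_pred)

metric = ClassificationMetric(test, test_pred_transf,
                              unprivileged_groups=unprivileged_groups,
                              privileged_groups=privileged_groups)
print('Statistical parity difference after ROC:',
      metric.statistical_parity_difference())
```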
Summary
- Fairness has become a very popular topic in the ML community in recent years.
- Fairness matters because the decisions these systems make affect everyone.
- Unfairness in ML systems is mainly due to human bias existing in the training data.
- Trade-off between accuracy and fairness usually exists.
- There are three streams of methods: preprocessing, optimization at training time, and post-processing. Each has pros and cons.
- Most fairness-aware algorithms rely on the sensitive attributes to achieve certain fairness notions. However, such information may not be available in practice, which makes exploratory data analysis all the more important.
This article was co-authored by Saichandra Pandraju and Sakthi Ganesh
References
[1] “Machine Bias. There’s software used across the country to predict…” https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
[2] “Amazon scraps secret AI recruiting tool that showed bias ….” 9 Oct. 2018. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G.
[3] “Facebook charged with housing discrimination in targeted ads ….” https://www.theguardian.com/technology/2019/mar/28/facebook-ads-housing-discrimination-charges-us-government-hud.
[4] “Dissecting racial bias in an algorithm used to manage the ….”. https://science.sciencemag.org/content/366/6464/447.
[5] “AI Fairness 360 toolkit” https://github.com/Trusted-AI/AIF360
[6] “Conscientious Classification: A Data Scientist’s Guide to Discrimination-Aware Classification” https://www.liebertpub.com/doi/10.1089/big.2016.0048
[7] F. Kamiran and T. Calders, “Data Preprocessing Techniques for Classification without Discrimination,” Knowledge and Information Systems, 2012.
[8] B. H. Zhang, B. Lemoine, and M. Mitchell, “Mitigating Unwanted Biases with Adversarial Learning,” AIES 2018.
[9] F. Kamiran, A. Karim, and X. Zhang, “Decision Theory for Discrimination-Aware Classification,” ICDM 2012.
[10] “Adult — UCI Machine Learning.” 1 May. 1996, http://archive.ics.uci.edu/ml/datasets/Adult.
[11] https://github.com/Trusted-AI/AIF360/blob/master/examples/demo_reweighing_preproc.ipynb
[12] https://github.com/Trusted-AI/AIF360/blob/master/examples/demo_adversarial_debiasing.ipynb
[13] https://github.com/Trusted-AI/AIF360/blob/master/examples/demo_reject_option_classification.ipynb