Sources of Bias in Generative AI
Rosie Campbell, a member of the Trust and Safety team at OpenAI, notes that bias is an overloaded term in AI. Bias can mean either that the training data was poorly sampled and does not accurately reflect the real world, or that the training data does reflect the real world, but the world itself is biased in ways that we don’t endorse.
One common example of bias in generative AI is image generation tools perpetuating outdated stereotypes. For instance, a tool may return images of a middle-aged white man in a suit when asked to generate a CEO or a young woman when asked to generate a flight attendant. Additionally, text models can exhibit bias in more subtle ways, such as using different adjectives when describing male characters vs. female characters or making offensive assumptions about someone’s background or personality based on their race.
Detecting and Addressing Bias in Generative AI
One of the biggest challenges in addressing bias in generative AI is simply detecting it. It’s usually impossible to tell from an individual output whether a system is biased overall. Additionally, the internal representations developed by neural networks are largely incomprehensible to humans, which makes it hard to identify where bias comes from.
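Because no single output proves anything, bias audits typically compare the distribution of many outputs against a reference distribution. The sketch below illustrates the idea with a hypothetical audit helper; the generated samples, the attribute labels, and the expected proportions are all illustrative assumptions, not real data or a real API.

```python
from collections import Counter

def audit_outputs(outputs, attribute_of, expected):
    """Compare the attribute distribution of many model outputs
    against an expected (e.g. real-world) distribution.

    outputs      -- list of generated samples
    attribute_of -- function mapping a sample to an attribute label
    expected     -- dict of label -> expected proportion
    """
    counts = Counter(attribute_of(o) for o in outputs)
    total = len(outputs)
    report = {}
    for label, expected_share in expected.items():
        observed_share = counts.get(label, 0) / total
        report[label] = {
            "observed": observed_share,
            "expected": expected_share,
            "skew": observed_share - expected_share,
        }
    return report

# Hypothetical audit: 1,000 generated "CEO" images, each already
# tagged by some classifier with a perceived-gender label.
outputs = ["man"] * 870 + ["woman"] * 130
report = audit_outputs(outputs, attribute_of=lambda o: o,
                       expected={"man": 0.5, "woman": 0.5})
print(report["man"]["skew"])   # positive skew toward "man"
```

The point is that the skew only becomes visible at the level of the report, never from inspecting one image at a time.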
To curb bias in AI, it is essential to pay special attention to how you collect and sanitise your training data, keeping in mind whether it is representative of your intended use case. Fine-tuning the model and thoroughly testing your system in its intended domain can also help mitigate negative consequences, and audits and benchmark research can help surface signs of bias. Even if the model itself is biased, it is sometimes possible to mitigate negative consequences in deployment. Having an accountable “human in the loop” to verify the outputs and decisions of AI systems is a useful mitigation.
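A human-in-the-loop setup can be as simple as a wrapper that routes suspect outputs to a reviewer before release. This is a minimal sketch under assumed stand-in functions (`generate`, `flags_bias`, `human_review`), not a real moderation API:

```python
def moderated_generate(prompt, generate, flags_bias, human_review):
    """Human-in-the-loop wrapper (illustrative sketch).

    generate     -- the underlying model call
    flags_bias   -- heuristic that marks outputs needing review
    human_review -- accountable reviewer who approves or replaces the output
    """
    output = generate(prompt)
    if flags_bias(output):
        # Route flagged outputs to an accountable human before release
        output = human_review(prompt, output)
    return output

# Toy usage with stand-in functions
generate = lambda p: f"A middle-aged man in a suit ({p})"
flags_bias = lambda text: "man in a suit" in text          # crude keyword heuristic
human_review = lambda p, text: f"[REVIEWED] {text}"

print(moderated_generate("portrait of a CEO", generate, flags_bias, human_review))
```

In practice the flagging heuristic would be far more sophisticated, but the structure stays the same: the system never releases a flagged output without an accountable sign-off.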
Reinforcing and Amplifying Biases in Generative AI
As we continue to develop more complex generative AI models, we need to consider how biases can be reinforced and amplified. For example, if the images for an image generation model are poorly sampled and the model is therefore more likely to produce purple-toned images than anything else, those outputs can end up in the training dataset for future models. Even if the original sampling method is fixed, the new dataset will skew even more purple, making the next model still more likely to produce purple-toned images.
Additionally, a biased system could disproportionately benefit or harm different groups. For instance, a medical AI tool might be great at designing treatment plans for common conditions but may be much worse than the status quo when addressing rare health concerns. As we incorporate AI into different workflows, we should carefully consider how the distribution of benefits may change and who might be negatively affected by it.
While AI has the potential to bring tremendous benefits, it’s important to recognise that it’s not free from biases. The challenge ahead of us is to ensure that we are enabling positive use cases while minimising the risks. As we continue to develop more complex generative AI models, we must work towards mitigating biases while also improving our collective literacy on the strengths and limitations of these systems. By doing so, we can ensure that AI is beneficial to humans and aligned with human values. It’s a pretty exciting time to be alive, don’t you agree?