Studying Bias in GANs through the Lens of Race

@inproceedings{Maluleke2022StudyingBI,
  title={Studying Bias in GANs through the Lens of Race},
  author={Vongani Hlavutelo Maluleke and Neerja Thakkar and Tim Brooks and Ethan Weber and Trevor Darrell and Alexei A. Efros and Angjoo Kanazawa and Devin Guillory},
  booktitle={European Conference on Computer Vision},
  year={2022}
}
In this work, we study how the performance and evaluation of generative image models are impacted by the racial composition of their training datasets. By examining and controlling the racial distributions in various training datasets, we are able to observe the impacts of different training distributions on generated image quality and the racial distributions of the generated images. Our results show that the racial compositions of generated images successfully preserve that of the training…

How to Boost Face Recognition with StyleGAN?

It is shown that a simple approach based on tuning an encoder for StyleGAN improves upon state-of-the-art facial recognition and outperforms training on synthetic face identities.

References

SHOWING 1-10 OF 50 REFERENCES

Training Generative Adversarial Networks with Limited Data

It is demonstrated, on several datasets, that good results are now possible using only a few thousand training images, often matching StyleGAN2 results with an order of magnitude fewer images; this is expected to open up new application domains for GANs.

GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium

This work proposes a two time-scale update rule (TTUR) for training GANs with stochastic gradient descent on arbitrary GAN loss functions and introduces the "Frechet Inception Distance" (FID) which captures the similarity of generated images to real ones better than the Inception Score.
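FID compares the mean and covariance of feature embeddings of real and generated images, treating each set as a Gaussian and taking the Fréchet distance between them. A minimal numpy sketch of that computation follows; in practice the features come from a pretrained Inception network, but here any `(n_samples, dim)` arrays stand in for them, and the matrix square root uses the symmetric-PSD identity Tr((Σᵣ Σ_g)^½) = Tr((Σᵣ^½ Σ_g Σᵣ^½)^½):

```python
import numpy as np

def _sqrtm_psd(mat):
    """Matrix square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # clip tiny negative eigenvalues from fp noise
    return (vecs * np.sqrt(vals)) @ vecs.T

def fid(feats_real, feats_gen):
    """FID = ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^{1/2})."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)
    cov_r_half = _sqrtm_psd(cov_r)
    # Trace of (cov_r cov_g)^{1/2} via the symmetric PSD form above
    tr_covmean = np.trace(_sqrtm_psd(cov_r_half @ cov_g @ cov_r_half))
    diff = mu_r - mu_g
    return diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_covmean

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 16))
gen = rng.normal(loc=0.5, size=(500, 16))
print(fid(real, real))  # identical sets give a score of (numerically) zero
print(fid(real, gen))   # shifted distribution gives a positive score
```

Lower is better: a score near zero means the generated feature distribution matches the real one in its first two moments.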

A Style-Based Generator Architecture for Generative Adversarial Networks

An alternative generator architecture for generative adversarial networks is proposed, borrowing from style transfer literature, that improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation.

Imagining an Engineer: On GAN-Based Data Augmentation Perpetuating Biases

It is shown that starting with a dataset consisting of head-shots of engineering researchers, GAN-based augmentation "imagines" synthetic engineers, most of whom have masculine features and white skin color (inferred from a human subject study conducted on Amazon Mechanical Turk).

Large Scale GAN Training for High Fidelity Natural Image Synthesis

It is found that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator's input.
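The truncation trick mentioned above reduces the variance of the generator's latent input by rejecting extreme samples, trading diversity for fidelity. A minimal sketch under that reading (threshold value and resampling strategy are illustrative, not BigGAN's exact implementation):

```python
import numpy as np

def truncated_latents(shape, threshold=0.7, rng=None):
    """Sample latents from N(0, 1), resampling any entry whose magnitude
    exceeds `threshold`. Lower thresholds concentrate samples near the mode,
    improving fidelity at the cost of variety."""
    rng = rng or np.random.default_rng()
    z = rng.normal(size=shape)
    while True:
        mask = np.abs(z) > threshold
        if not mask.any():
            return z
        z[mask] = rng.normal(size=mask.sum())  # redraw only the offending entries

z = truncated_latents((4, 128), threshold=0.5, rng=np.random.default_rng(0))
print(np.abs(z).max() <= 0.5)  # True: all entries lie within the threshold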

Classifier-Free Diffusion Guidance

This work shows that guidance can be performed by a pure generative model without such a classifier, and that it is possible to combine the resulting conditional and unconditional scores to attain a trade-off between sample quality and diversity similar to that obtained using classifier guidance.
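The combination of conditional and unconditional scores is a simple linear extrapolation; a minimal sketch, where `eps_uncond` and `eps_cond` stand in for the model's noise predictions without and with conditioning:

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, guidance_weight):
    """Classifier-free guidance: extrapolate from the unconditional prediction
    toward the conditional one. A weight of 0 is purely unconditional, 1 is
    purely conditional, and values above 1 strengthen guidance (higher sample
    quality, lower diversity)."""
    return eps_uncond + guidance_weight * (eps_cond - eps_uncond)

# Toy scalar example of the extrapolation
eps_u, eps_c = np.array([0.0]), np.array([1.0])
print(cfg_combine(eps_u, eps_c, 2.0))  # [2.] — pushed past the conditional score
```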

Rethinking Common Assumptions to Mitigate Racial Bias in Face Recognition Datasets

In these experiments, training on only African faces induced less bias than training on a balanced distribution of faces; distributions skewed to include more African faces produced more equitable models; and adding more images of existing identities to a dataset, rather than adding new identities, can lead to accuracy boosts across racial categories.

Moving beyond “algorithmic bias is a data problem”

Learning Transferable Visual Models From Natural Language Supervision

It is demonstrated that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet.
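That caption-matching pre-training task is a symmetric contrastive objective over a batch of (image, text) embedding pairs: each caption's positive is the image at the same batch index. A minimal numpy sketch of the loss (embedding arrays and the temperature value are illustrative stand-ins, not CLIP's actual encoders or hyperparameters):

```python
import numpy as np

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric cross-entropy over the image-text similarity matrix.
    Row i of each array is assumed to be a matched (image, caption) pair."""
    # L2-normalize so the dot product is cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (n, n): entry (i, j) scores pair (i, j)
    idx = np.arange(len(logits))

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerically stable log-softmax
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()         # true pairs lie on the diagonal

    # Average the image->text and text->image directions
    return (xent(logits) + xent(logits.T)) / 2.0

matched = contrastive_loss(np.eye(4), np.eye(4))
shuffled = contrastive_loss(np.eye(4), np.roll(np.eye(4), 1, axis=0))
print(matched < shuffled)  # True: aligned pairs score a lower loss
```

Minimizing this loss pulls matched image and caption embeddings together while pushing mismatched pairs apart, which is what makes the learned representations transferable.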