Abstract

The use of facial recognition technology in fields such as law enforcement is raising concern about biases embedded in the algorithms. Recent MIT research demonstrated that commercial facial recognition technology performs poorly on people of color, particularly women of color. The objective of this work is to implement a variational autoencoder (VAE) that generates low-dimensional latent representations of faces and then to analyze those latent representations to interpret potentially learned biases. While the field of interpreting facial recognition biases is still emerging, previous work has also relied on VAEs to better understand the relationship between images and their latent representations. We implement a 10-layer VAE and analyze its encoding of images to a single latent feature and to ten latent features. For the single-feature encoding, by observing the images corresponding to the highest and lowest activation values, we hypothesize that the model focuses on the overall brightness or darkness of an image: faces with darker tones are assigned the lowest values and faces with lighter tones are assigned the highest values. To test this hypothesis, we manually feed values for the single latent feature into the decoder at equally spaced increments and observe the reconstructed images. The reconstructed images support the hypothesis: as the latent feature value increases, the image becomes brighter and the facial attributes become lighter. This shows that protected attributes such as race (e.g., skin color) play a role in latent representations, and it lays the groundwork for interpreting the encodings of individual latent features to address algorithmic biases. Future work could entail finding a set of latent features that more accurately represents these protected characteristics.
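
To make the latent-traversal experiment concrete, the sketch below shows how equally spaced values of a single latent feature can be fed through a VAE decoder and the reconstructions inspected. This is a minimal illustration only: the decoder architecture, latent dimensionality of 1, 64x64 image size, value range, and use of PyTorch are assumptions for the example and are not the paper's actual implementation.

```python
# Hypothetical sketch of a latent traversal; layer sizes and names are illustrative.
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Toy stand-in for a VAE decoder: latent vector -> reconstructed face image."""
    def __init__(self, latent_dim=1):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 256 * 4 * 4)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 256, 4, 4)
        return self.deconv(x)  # (batch, 3, 64, 64) images with values in [0, 1]

# Latent traversal: feed equally spaced values of the single latent feature
# into the (trained) decoder and observe how the reconstructed faces change.
decoder = Decoder(latent_dim=1)          # in practice, load trained weights first
values = torch.linspace(-3.0, 3.0, steps=10).unsqueeze(1)  # equally spaced z values
with torch.no_grad():
    images = decoder(values)             # one reconstructed face per latent value
print(images.shape)                      # torch.Size([10, 3, 64, 64])
```

Visualizing the resulting images side by side, ordered by the traversed latent value, is what allows a qualitative check of whether the feature tracks an attribute such as overall skin-tone brightness.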
