Abstract

In the past several years, generative adversarial networks (GANs) have emerged that are capable of creating realistic synthetic images of human faces. Because these images can be used for malicious purposes, researchers have begun to develop techniques to detect synthetic images. Currently, the majority of existing techniques operate by searching for statistical traces introduced when an image is synthesized by a GAN. An alternative approach that has received comparatively less research attention involves using semantic inconsistencies to detect synthetic images. While GAN-generated synthetic images appear visually realistic at first glance, they often contain subtle semantic inconsistencies such as inconsistent eye highlights, misaligned teeth, and unrealistic hair textures. In this paper, we propose a new approach to detect GAN-generated images of human faces by searching for semantic inconsistencies in multiple facial features such as the eyes, mouth, and hair. Synthetic image detection decisions are made by fusing the outputs of these facial-feature-level detectors. Through a series of experiments, we demonstrate that this approach can yield strong synthetic image detection performance. Furthermore, we experimentally demonstrate that our approach is less susceptible to performance degradations caused by post-processing than CNN-based detectors that utilize statistical traces.
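
As an illustration of what fusing facial-feature-level detector outputs could look like, the sketch below combines per-feature scores with a weighted average and a threshold. This is only an assumed, minimal fusion rule: the abstract does not specify the fusion method, and the function name `fuse_scores`, the score convention (values in [0, 1], higher meaning more likely synthetic), the weights, and the 0.5 threshold are all hypothetical.

```python
def fuse_scores(scores, weights=None, threshold=0.5):
    """Fuse per-feature detector scores into one decision.

    Hypothetical sketch: `scores` maps a facial feature name (e.g. "eyes")
    to a score in [0, 1], where higher means "more likely synthetic".
    A weighted average is an assumed fusion rule, not the paper's method.
    """
    if not scores:
        raise ValueError("at least one feature score is required")
    if weights is None:
        # Assumption: all features contribute equally by default.
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = sum(weights[name] * scores[name] for name in scores) / total_weight
    # Return both the fused score and the thresholded decision.
    return fused, fused >= threshold


# Example scores from three hypothetical feature-level detectors.
fused, is_synthetic = fuse_scores({"eyes": 0.9, "mouth": 0.7, "hair": 0.2})
```

A learned fusion stage (for example, a small classifier over the feature-level outputs) could replace the weighted average without changing this interface.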
