Data augmentation is commonly used to increase the size and diversity of training sets for deep learning tasks. In this study, we propose a novel application of an existing image generation approach to the domain of realistic eye images, leveraging data collected from 40 subjects. This hybrid method combines the precise control over image content provided by 3D rendering with the photorealism and diversity, previously lacking in synthetic images, introduced through neural style transfer. We demonstrate its efficacy as a data augmentation tool for appearance-based gaze estimation when the generated data are mixed with a sparse training set of real images. The method improved results for 39 of 40 subjects, with an 11.22% mean and a 19.75% maximum decrease in gaze estimation error, achieving similar metrics for training and held-out subjects. We publicly release the repository of eye images with gaze labels used in this work.
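To make the augmentation scheme concrete, the sketch below (not the authors' code) shows one way to mix style-transferred synthetic eye images with a sparse set of real labeled images when training an appearance-based gaze regressor in PyTorch. The dataset sizes, image resolution, and network architecture are illustrative placeholders, not values from the paper.

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

# Placeholder tensors stand in for (eye image, gaze direction) pairs.
real_imgs = torch.rand(200, 1, 36, 60)    # sparse real training set
real_gaze = torch.rand(200, 2)            # (yaw, pitch) labels
syn_imgs = torch.rand(2000, 1, 36, 60)    # rendered + style-transferred images
syn_gaze = torch.rand(2000, 2)            # labels known exactly from rendering

# Data augmentation: concatenate the real and synthetic samples into one set.
mixed = ConcatDataset([TensorDataset(real_imgs, real_gaze),
                       TensorDataset(syn_imgs, syn_gaze)])
loader = DataLoader(mixed, batch_size=64, shuffle=True)

# A small CNN regressor mapping an eye patch to a 2D gaze angle.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 9 * 15, 2),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for imgs, gaze in loader:                 # one pass over the mixed set
    optimizer.zero_grad()
    loss_fn(model(imgs), gaze).backward()
    optimizer.step()
```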