Deep learning has revolutionized many scene perception tasks over the past decade. Some of these improvements can be attributed to the development of large labeled datasets. The creation of such datasets can be an expensive, time-consuming, and imperfect process. To address these issues, we introduce GeoSynth, a diverse photorealistic synthetic dataset for indoor scene understanding tasks. Each GeoSynth exemplar contains rich labels including segmentation, geometry, camera parameters, surface material, lighting, and more. We demonstrate that supplementing real training data with GeoSynth can significantly improve network performance on perception tasks, like semantic segmentation. A subset of our dataset will be made publicly available at https://github.com/geomagical/GeoSynth.
Read full abstract