Abstract
Vision-based building scene understanding (BSU) is essential to many applications such as indoor navigation and 3D reconstruction. Deep learning excels in various BSU tasks but its training relies heavily on large-scale datasets, which are often burdensome and costly to obtain. This study proposes a universal and automated approach to generate synthetic building images with comprehensive annotations that support twelve BSU tasks, including camera pose estimation, scene recognition, depth estimation, 2D/3D object and object part detection, object and material semantic segmentation, object and object part instance segmentation, and panoptic segmentation. This approach captures and renders perspective views of indoor BIM model scenes to generate photorealistic images. Required annotations are computed by a ray tracing-based algorithm accelerated by two-tier axis-aligned bounding box tree indexing. The approach was validated by efficiently producing annotated synthetic images from various BIM models, and preliminary experiments demonstrated the promising efficacy of the synthetic images for deep learning model training.
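The two-tier bounding-box idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the first tier holds one axis-aligned bounding box (AABB) per object and the second tier holds one AABB per object part, it uses flat lists instead of balanced trees, and it treats part boxes as the final geometry instead of triangles. The names `AABB`, `Part`, `SceneObject`, `ray_aabb`, and `cast` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class AABB:
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner


def ray_aabb(origin, direction, box):
    """Slab test: return the entry distance t >= 0, or None if the ray misses."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box.lo, box.hi):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near


@dataclass
class Part:
    label: str
    box: AABB


@dataclass
class SceneObject:
    label: str
    parts: list
    box: AABB = field(init=False)  # tier-1 box, derived from the part boxes

    def __post_init__(self):
        lo = tuple(min(p.box.lo[i] for p in self.parts) for i in range(3))
        hi = tuple(max(p.box.hi[i] for p in self.parts) for i in range(3))
        self.box = AABB(lo, hi)


def cast(origin, direction, objects):
    """Return (depth, object label, part label) of the nearest hit, or None."""
    best = None
    for obj in objects:
        # Tier 1: skip the whole object if its box is missed or is
        # entered no earlier than the best hit found so far.
        t_obj = ray_aabb(origin, direction, obj.box)
        if t_obj is None or (best is not None and t_obj >= best[0]):
            continue
        # Tier 2: test only the parts of objects that survived tier 1.
        for part in obj.parts:
            t = ray_aabb(origin, direction, part.box)
            if t is not None and (best is None or t < best[0]):
                best = (t, obj.label, part.label)
    return best


# Toy scene (assumed data): a door with a handle, in front of a wall.
door = SceneObject("door", [
    Part("panel", AABB((2.0, 0.0, 0.0), (2.1, 1.0, 2.0))),
    Part("handle", AABB((1.9, 0.8, 1.0), (2.0, 0.9, 1.1))),
])
wall = SceneObject("wall", [Part("surface", AABB((5.0, -3.0, 0.0), (5.2, 3.0, 3.0)))])
scene = [door, wall]

hit = cast((0.0, 0.85, 1.05), (1.0, 0.0, 0.0), scene)
```

In this sketch, one ray per pixel would yield a depth value plus object and part labels, which is the kind of per-pixel information that depth, semantic, and part-level annotations are built from. The wall is never tested at the part level for this ray, because its object-level box lies behind the hit already found on the door.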