Abstract

Abstract. Current 2D and 3D semantic segmentation frameworks are developed and trained on specific benchmark datasets, often rich of synthetic data, and when they are applied to complex and real-world heritage scenarios they offer much lower accuracy than expected. In this work, we present and demonstrate an early and late fusion of methods for semantic segmentation in cultural heritage applications. We rely on image datasets, point clouds and BIM models. The early fusion utilizes multi-view rendering to generate RGBD imagery of the scene. In contrast, the late fusion approach merges image-based segmentation with a Point Transformer applied to point clouds. Two scenarios are considered and inference results show that predictions are primarily influenced by whether the scene has a predominantly geometric or texture-based signature, underscoring the necessity of fusion methods.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.