3D Semantic Scene Completion From A Depth Map With Unsupervised Learning For Semantics Prioritisation
The Semantic Scene Completion (SSC) problem entails generating a complete 3D voxel representation of a scene from a partial view, while simultaneously predicting volumetric occupancy and object category. A significant challenge in SSC is inferring occluded regions in 3D space and accurately predicting object categories in a class-imbalanced setting. To address this challenge, our study reviews the SSC literature and introduces a simple, innovative class-balancing re-weighting technique rooted in unsupervised clustering, leading to balanced learning and a generalised representation. This method modulates the penalty on dataset classes during CNN training, emphasising infrequent classes while moderately de-prioritising the dominant ones. It combines the strengths of both re-sampling and cost-sensitive learning, enhancing performance on both the scene completion and scene semantics tasks. Our design, which relies on a single depth input without any RGB information, is shown to significantly outperform comparable baseline models. Our results are also competitive with those of multi-input methods.
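To make the idea of clustering-based class re-weighting concrete, the following is a minimal sketch and not the scheme used in this work, whose details the abstract does not give. It assumes, purely for illustration, that classes are grouped by a 1D k-means over their log voxel frequencies and that each group receives a damped inverse-frequency weight. The function names `kmeans_1d` and `cluster_class_weights`, the `damping` parameter, and the choice of three clusters are all hypothetical.

```python
import math


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalars (illustrative helper, not from the paper).

    Centres are initialised at evenly spaced order statistics of the data.
    """
    ordered = sorted(values)
    n = len(ordered)
    centres = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Recompute each centre as the mean of its members (keep it if empty).
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


def cluster_class_weights(voxel_counts, k=3, damping=0.5):
    """Assumed scheme: one loss weight per class, shared within a frequency cluster.

    voxel_counts: positive per-class voxel counts from the training set.
    damping < 1 softens the inverse-frequency weight so that dominant
    classes are only moderately down-weighted.
    """
    total = sum(voxel_counts)
    log_freq = [math.log(c / total) for c in voxel_counts]
    labels, centres = kmeans_1d(log_freq, k)
    # Each cluster centre is a mean log-frequency; invert it with damping.
    raw = [math.exp(-damping * centres[lab]) for lab in labels]
    mean_raw = sum(raw) / len(raw)
    # Normalise so the weights average to 1 across classes.
    return [r / mean_raw for r in raw]
```

Under these assumptions, the returned list could be supplied as per-class weights to a weighted cross-entropy loss, so that classes in the rare cluster incur a larger penalty than those in the dominant cluster. Sharing one weight per cluster, rather than one per class, is a design choice of this sketch that keeps classes of similar frequency from being ranked against each other by small count differences.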