With the continuous advancement of the construction of smart cities, the availability of large-scale and semantically enriched datasets is essential for enhancing the machine’s ability to understand urban scenes. Mesh data have a distinct advantage over point cloud data for large-scale scenes, as they can provide inherent geometric topology information and consume less memory space. However, existing publicly available large-scale scene mesh datasets are limited in scale and semantic richness and do not cover a wide range of urban semantic information. The development of 3D semantic segmentation algorithms depends on the availability of datasets. Moreover, existing large-scale 3D datasets lack various types of official annotation data, which hinders the widespread applicability of benchmark applications and may cause label errors during data conversion. To address these issues, we present a comprehensive urban-scale semantic segmentation benchmark dataset. It is suitable for various research pursuits on semantic segmentation methodologies. This dataset contains finely annotated point cloud and mesh data types for 3D, as well as high-resolution original 2D images with detailed 2D semantic annotations. It is constructed from a 3D reconstruction of 10,840 UVA aerial images and spans a vast area of approximately 2.85 square kilometers that covers both urban and rural scenes. The dataset is composed of 152,298,756 3D points and 289,404,088 triangles. Each 3D point, triangular mesh, and the original 2D image in the dataset are carefully labeled with one of the ten semantic categories. Six typical 3D semantic segmentation methods were compared on the CUS3D dataset, with KPConv demonstrating the highest overall performance. The mIoU is 59.72%, OA is 89.42%, and mAcc is 97.88%. Furthermore, the experimental results on the impact of color information on semantic segmentation suggest that incorporating both coordinate and color features can enhance the performance of semantic segmentation. The current limitations of the CUS3D dataset, particularly in class imbalance, will be the primary target for future dataset enhancements.
Read full abstract