Abstract

We address the issue of the semantic segmentation of large-scale 3D scenes by fusing 2D images and 3D point clouds. First, a Deeplab-Vgg16 based Large-Scale and High-Resolution model (DVLSHR) based on deep Visual Geometry Group (VGG16) is successfully created and fine-tuned by training seven deep convolutional neural networks with four benchmark datasets. On the val set in CityScapes, DVLSHR achieves a 74.98% mean Pixel Accuracy (mPA) and a 64.17% mean Intersection over Union (mIoU), and can be adapted to segment the captured images (image resolution 2832 ∗ 4256 pixels). Second, the preliminary segmentation results with 2D images are mapped to 3D point clouds according to the coordinate relationships between the images and the point clouds. Third, based on the mapping results, fine features of buildings are further extracted directly from the 3D point clouds. Our experiments show that the proposed fusion method can segment local and global features efficiently and effectively.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.