Abstract

Extracting building facades from image sequences contributes substantially to the construction of realistic digital cities. Superpixel segmentation is widely used as a pre-processing step for segmentation because it is fast, general, and accurate. However, 2D features alone are unreliable for building facades, which usually have complex textures and geometric features, so clustering superpixels rarely recovers accurate facade detail. Moreover, the acquisition of building image sequences is easily disturbed by environmental factors, which further degrades superpixel segmentation results. To address this problem, this paper defines 3D local pose-varied semantic features of buildings, computed from 3D point clouds that are generated from multi-view images of buildings using structure from motion (SfM) and patch-based multi-view stereo (PMVS). Multi-modal superpixels that integrate 2D texture with the 3D pose-varied semantic features are then computed using fully convolutional networks. The new method is compared with traditional superpixel segmentation methods using standard superpixel evaluation metrics: achievable segmentation accuracy, boundary recall, and undersegmentation error. The method achieves accurate segmentation results and effectively excludes the influence of complex texture and environmental factors. In summary, the multi-modal superpixels obtained by this integration are more reliable and offer a new approach to superpixel segmentation of building facades, with both theoretical and practical significance.
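
The three evaluation metrics named above can be illustrated with a minimal NumPy sketch. The abstract does not give their formulas, so the definitions below are assumptions based on commonly used formulations: achievable segmentation accuracy assigns each superpixel to the ground-truth segment it overlaps most, undersegmentation error uses the min-based "leakage" variant (several variants exist in the literature), and boundary recall counts ground-truth boundary pixels lying within a pixel tolerance of a superpixel boundary. All function names and the `tol` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def overlap_matrix(sp, gt):
    """Pixel counts shared by each (superpixel, ground-truth segment) pair."""
    sp_ids, sp_idx = np.unique(sp, return_inverse=True)
    gt_ids, gt_idx = np.unique(gt, return_inverse=True)
    counts = np.zeros((sp_ids.size, gt_ids.size), dtype=np.int64)
    np.add.at(counts, (sp_idx.ravel(), gt_idx.ravel()), 1)
    return counts

def achievable_segmentation_accuracy(sp, gt):
    """Best accuracy reachable if every superpixel takes a single label."""
    counts = overlap_matrix(sp, gt)
    return counts.max(axis=1).sum() / sp.size

def undersegmentation_error(sp, gt):
    """Min-based leakage (assumed variant): for each overlapping pair, the
    smaller of the part inside and the part outside the segment."""
    counts = overlap_matrix(sp, gt)
    sizes = counts.sum(axis=1, keepdims=True)  # superpixel sizes
    leak = np.minimum(counts, sizes - counts)  # zero where there is no overlap
    return leak.sum() / sp.size

def boundary_map(labels):
    """Mark pixels whose right or lower neighbour has a different label."""
    b = np.zeros(labels.shape, dtype=bool)
    b[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    b[:-1, :] |= labels[:-1, :] != labels[1:, :]
    return b

def boundary_recall(sp, gt, tol=1):
    """Fraction of ground-truth boundary pixels within `tol` pixels
    (square neighbourhood) of a superpixel boundary pixel."""
    gt_b, sp_b = boundary_map(gt), boundary_map(sp)
    h, w = sp_b.shape
    padded = np.pad(sp_b, tol)  # pads with False
    near = np.zeros_like(sp_b)
    for dy in range(2 * tol + 1):
        for dx in range(2 * tol + 1):
            near |= padded[dy:dy + h, dx:dx + w]
    total = gt_b.sum()
    return 1.0 if total == 0 else (gt_b & near).sum() / total

# Toy example: ground truth splits a 4x4 image into left and right halves;
# `merged` is a deliberately poor segmentation with one superpixel.
gt = np.array([[0, 0, 1, 1]] * 4)
merged = np.zeros((4, 4), dtype=int)
for name, sp in [("perfect", gt), ("merged", merged)]:
    print(name,
          achievable_segmentation_accuracy(sp, gt),
          undersegmentation_error(sp, gt),
          boundary_recall(sp, gt))
```

Higher achievable segmentation accuracy and boundary recall, together with lower undersegmentation error, indicate superpixels that adhere better to the reference segmentation.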
