Urban management and survey departments have begun investigating the feasibility ofacquiring data from various laser scanning systems for urban infrastructure measurements and assessments. Roadside objects such as cars, trees, traffic poles, pedestrians, bicycles and e-bicycles describe the static and dynamic urban information available for acquisition. Because of the unstructured nature of 3D point clouds, the rich targets in complex road scenes, and the varying scales of roadside objects, finely classifying these roadside objects from various point clouds is a challenging task. In this paper, we integrate two representations of roadside objects, point clouds and multiview images to propose a point-group-view network named PGVNet for classifying roadside objects into cars, trees, traffic poles, and small objects (pedestrians, bicycles and e-bicycles) from generalized point clouds. To utilize the topological information of the point clouds, we propose a graph attention convolution operation called AtEdgeConv to mine the relationship among the local points and to extract local geometric features. In addition, we employ a hierarchical view-group-object architecture to diminish the redundant information between similar views and to obtain salient viewwise global features. To fuse the local geometric features from the point clouds and the global features from multiview images, we stack an attention-guided fusion network in PGVNet. In particular, we quantify and leverage the global features as an attention mask to capture the intrinsic correlation and discriminability of the local geometric features, which contributes to recognizing the different roadside objects with similar shapes. To verify the effectiveness and generalization of our methods, we conduct extensive experiments on six test datasets of different urban scenes, which were captured by different laser scanning systems, including mobile laser scanning (MLS) systems, unmanned aerial vehicle (UAV)-based laser scanning (ULS) systems and backpack laser scanning (BLS) systems. Experimental results, and comparisons with state-of-the-art methods, demonstrate that the PGVNet model is able to effectively identify various cars, trees, traffic poles and small objects from generalized point clouds, and achieves promising performances on roadside object classifications, with an overall accuracy of 95.76%. Our code is released on https://github.com/flidarcode/PGVNet.
Read full abstract