Three-dimensional (3D) scene reconstruction plays an important role in digital cities, virtual reality, and simultaneous localization and mapping (SLAM). In contrast to perspective images, a single panoramic image can capture the complete scene owing to its wide field of view. Extracting and matching image feature points is a critical and difficult step in 3D scene reconstruction from panoramic images. We address this problem using convolutional neural networks (CNNs). Compared with traditional feature extraction and matching algorithms, the SuperPoint (SP) and SuperGlue (SG) algorithms are better suited to images with distortion. However, the rich content of panoramic images makes these algorithms computationally expensive. To address this problem, we introduce the Improved Cube Projection Model: First, the panoramic image is projected into split-frame perspective images with significant overlap in six directions. Second, the SP and SG algorithms process the six split-frame images in parallel for feature extraction and matching. Finally, matched points are mapped back to the panoramic image through inverse coordinate mapping. Experimental results in multiple environments indicated that the algorithm preserves both the number and accuracy of extracted feature points while significantly reducing computation time compared to other commonly used algorithms.
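A minimal sketch of the projection step is given below, assuming an OpenCV/NumPy pipeline; the 100° face field of view (chosen above 90° so that neighbouring faces overlap), the 512-pixel face size, and all function names are illustrative assumptions, not details taken from the paper. The same per-pixel ray-to-longitude/latitude mapping that samples each face also serves as the inverse coordinate mapping: evaluating `map_x`/`map_y` at a matched keypoint's face coordinates returns its position in the original panorama.

```python
# Hedged sketch of an equirectangular-to-cube-face projection; parameter
# values (fov_deg=100, size=512) and names are assumptions for illustration.
import numpy as np
import cv2

def cube_face(pano, yaw_deg, pitch_deg, fov_deg=100.0, size=512):
    """Render one perspective 'split-frame' face from an equirectangular panorama.

    fov_deg > 90 produces the overlap between neighbouring faces that the
    matching stage relies on (the exact overlap in the paper may differ).
    """
    H, W = pano.shape[:2]
    f = (size / 2.0) / np.tan(np.radians(fov_deg) / 2.0)  # pinhole focal length

    # Pixel grid -> camera-space ray directions (z points forward).
    u, v = np.meshgrid(np.arange(size), np.arange(size))
    x = u - (size - 1) / 2.0
    y = v - (size - 1) / 2.0
    z = np.full_like(x, f, dtype=np.float64)
    rays = np.stack([x, y, z], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)

    # Rotate rays into the face's viewing direction (pitch, then yaw).
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)],
                   [0, 1, 0],
                   [-np.sin(yaw), 0, np.cos(yaw)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(pitch), -np.sin(pitch)],
                   [0, np.sin(pitch), np.cos(pitch)]])
    rays = rays @ (Ry @ Rx).T

    # Ray direction -> longitude/latitude -> equirectangular pixel coordinates.
    # This map, read in reverse, is the inverse coordinate mapping used to
    # carry matched keypoints back onto the panorama.
    lon = np.arctan2(rays[..., 0], rays[..., 2])           # [-pi, pi]
    lat = np.arcsin(np.clip(rays[..., 1], -1.0, 1.0))      # [-pi/2, pi/2]
    map_x = ((lon / np.pi + 1.0) / 2.0 * (W - 1)).astype(np.float32)
    map_y = ((lat / (np.pi / 2) + 1.0) / 2.0 * (H - 1)).astype(np.float32)
    return cv2.remap(pano, map_x, map_y, cv2.INTER_LINEAR)

# Six overlapping faces: front/right/back/left plus up and down.
# pano = cv2.imread("panorama.jpg")  # hypothetical input path
# faces = [cube_face(pano, yaw, 0) for yaw in (0, 90, 180, 270)] \
#       + [cube_face(pano, 0, -90), cube_face(pano, 0, 90)]
```

Under these assumptions, the six face images could then be dispatched to SP and SG workers in parallel, with each worker's matches converted back to panoramic coordinates via the stored `map_x`/`map_y` lookup.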