Abstract

High-Level Structure (HLS) extraction in aerial images consists of recognizing Three-Dimensional (3D) elements on human-made surfaces (objects, buildings, ground, etc.). Several approaches to HLS extraction in aerial images exist, but most of them process two or more images captured from different camera views, or process 3D data in the form of point clouds extracted from the camera images. In general, 3D point cloud and multiple-view approaches perform well on scenes captured as video or image sequences, but they need sufficient parallax to guarantee accuracy. An alternative is to process a single image and interpret the areas where human-made structure may be observed, which removes the parallax dependency but adds the challenge of resolving image ambiguities correctly. Motivated by this, we present the results of a novel method for HLS extraction from a single image. We focus on extracting building structures from aerial images of urban areas. The method has six steps. First, we use a new Convolutional Neural Network (CNN) architecture to recognize the labels (tree, roof, and floor) in the input image. Second, we use a CNN to predict the depth. Third, we divide the input image using a superpixel technique. Fourth, we assign each superpixel its majority label. Fifth, we recognize the structures using a proposed connection analysis that connects adjacent superpixels with equal labels (tree, roof, and floor). Finally, we apply a geometric analysis to the predicted depth of the recognized labels to extract the 3D shape of the building structure.
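
The fourth and fifth steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how adjacency is defined or how superpixels are merged, so 4-connectivity and a union-find merge are assumptions here. The function names `majority_label_per_superpixel` and `connect_superpixels` and the label encoding (0 = tree, 1 = roof, 2 = floor) are likewise hypothetical. The inputs stand in for the per-pixel CNN labels and a superpixel map produced by any superpixel technique.

```python
import numpy as np

# Assumed label encoding (not given in the abstract): 0 = tree, 1 = roof, 2 = floor.

def majority_label_per_superpixel(labels, superpixels):
    """Give every pixel of a superpixel the most frequent label inside it."""
    segmented = np.empty_like(labels)
    majority = {}
    for sp in np.unique(superpixels):
        mask = superpixels == sp
        counts = np.bincount(labels[mask])  # votes per class in this superpixel
        majority[int(sp)] = int(counts.argmax())
        segmented[mask] = majority[int(sp)]
    return segmented, majority

def connect_superpixels(superpixels, majority):
    """Merge 4-adjacent superpixels that share a majority label (union-find)."""
    parent = {sp: sp for sp in majority}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # smallest id becomes the root

    # Pairs of superpixel ids that touch horizontally or vertically.
    pairs = np.concatenate([
        np.stack([superpixels[:, :-1].ravel(), superpixels[:, 1:].ravel()], axis=1),
        np.stack([superpixels[:-1, :].ravel(), superpixels[1:, :].ravel()], axis=1),
    ])
    for a, b in np.unique(pairs, axis=0):
        a, b = int(a), int(b)
        if a != b and majority[a] == majority[b]:
            union(a, b)
    return {sp: find(sp) for sp in majority}

# Toy 4x4 example: four square superpixels and noisy per-pixel labels.
superpixels = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1],
                        [2, 2, 3, 3],
                        [2, 2, 3, 3]])
labels = np.array([[1, 1, 1, 1],
                   [1, 0, 1, 2],
                   [2, 2, 0, 0],
                   [2, 2, 0, 1]])

segmented, majority = majority_label_per_superpixel(labels, superpixels)
components = connect_superpixels(superpixels, majority)
```

In this toy input, superpixels 0 and 1 both vote for roof and touch each other, so they end up in the same component, while the floor and tree superpixels stay separate. Each resulting component is a candidate structure whose predicted depth values could then be passed to the geometric analysis.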
