Automatically and accurately mapping rooftops from very-high-resolution remote sensing images is an important task because buildings are closely tied to human activity. Two techniques are typically used for this task: semantic segmentation and instance segmentation. Semantic segmentation assigns a label (e.g., “building” or “not building”) to each pixel independently, which tends to produce blob-like segments. In contrast, the boundary of a rooftop can be modeled as a polygon whose vertices are encouraged to adhere to the rooftop’s boundary, which improves the shape of the extracted rooftop. Following this line of work, we present a multitask learning approach that predicts rooftop corners sequentially, using attention learned from the locations of boundaries in a given image region. The approach simulates how a person manually delineates rooftop outlines in an image and can produce accurate rooftop boundaries with sharp corners and straight lines between them. Specifically, the proposed method consists of three components: object detection, pixel-by-pixel classification of both edges and corners, and sequential delineation of rooftops using a convolutional recurrent neural network (RNN). We call it the object-oriented, edges and corners (OEC)-RNN in this article. Three building image datasets are used to validate the OEC-RNN, and its results are compared with those of state-of-the-art instance segmentation methods. The experimental results show that the OEC-RNN achieves the best performance in terms of overlap, boundary adherence, and vertex location between ground-truth and predicted polygons.