An Optimized Faster R-CNN Method Based on DRNet and RoI Align for Building Detection in Remote Sensing Images

Tong Bai, Kaining Han, Jiasai Luo, Jinzhao Lin, Junchao Wang, Hui Zhang, Yu Pang, Jun Wu, Huiqian Wang

doi:10.3390/rs12050762

Abstract

In recent years, the growing number of satellites and UAVs (unmanned aerial vehicles) has multiplied the amount of remote sensing data available, but only a small part of that data has been properly used; problems such as land planning, disaster management and resource monitoring still need to be solved. Buildings in remote sensing images have obvious positioning characteristics; thus, detecting buildings not only helps the mapping and automatic updating of geographic information systems but also guides the detection of other types of ground objects in remote sensing images. To address the deficiencies of traditional building detection in remote sensing images, this paper proposes an improved Faster R-CNN (region-based Convolutional Neural Network) algorithm that adopts DRNet (Dense Residual Network) and RoI (Region of Interest) Align to utilize texture information and to solve the region mismatch problem, respectively. The experimental results showed that this method reached 82.1% mAP (mean average precision) for the detection of landmark buildings and that the predicted bounding boxes of the buildings were relatively accurate, which improves the building detection results. Moreover, the recognition of buildings in complex environments was also excellent.

Highlights

High-resolution remote sensing images, which are widely used in various fields, can describe the geometric, spatial and texture features of ground objects more precisely than traditional images
Buildings are a major part of ground objects and the main component of topographic mapping [1]
The c × 512 × 16 × 16 feature map output by the model was used as the input of the RPN module to further extract candidate regions and predict categories; at the same time, it served as the mapping feature map of the Region of Interest (RoI) layer, consistent with the original algorithm

Summary

Introduction

High-resolution remote sensing images, which are widely used in various fields, can describe the geometric, spatial and texture features of ground objects more precisely than traditional images. The most important network structure in deep learning is the CNN (Convolutional Neural Network), which enables the computer to extract feature information automatically [3]. Ma et al. used features extracted from a deep convolutional neural network trained on an object recognition data set to improve tracking accuracy and robustness [13]. The Faster R-CNN algorithm has achieved excellent results in the field of target detection and recognition, and the performance of deep learning has been greatly improved. However, there is a mismatch between the actual candidate boxes and the obtained candidate boxes.
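The candidate-box mismatch mentioned above can be illustrated with a small sketch. It is not the authors' implementation: the stride of 16, the floor/ceil rounding convention, the pixel-coordinate convention, and all function names are assumptions chosen for illustration. It contrasts the coordinate rounding used by RoI Pooling with the fractional coordinates and bilinear sampling used by RoI Align.

```python
import math

def bilinear_sample(feature, y, x):
    """Interpolate a 2-D feature map (list of rows) at a fractional (y, x)."""
    h, w = len(feature), len(feature[0])
    # Clamp the sampling point to the valid index range of the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_pool_coords(box, stride):
    """RoI Pooling style: snap box corners to whole feature-map cells.
    Floor/ceil is an assumed convention; implementations differ."""
    x1, y1, x2, y2 = box
    return (math.floor(x1 / stride), math.floor(y1 / stride),
            math.ceil(x2 / stride), math.ceil(y2 / stride))

def roi_align_coords(box, stride):
    """RoI Align style: keep the fractional feature-map coordinates."""
    return tuple(v / stride for v in box)

def roi_align(feature, box, stride, out_size=2, samples=2):
    """Average samples x samples bilinear samples in each output bin."""
    x1, y1, x2, y2 = roi_align_coords(box, stride)
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    output = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Sample points are spread evenly inside the bin.
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(feature, y, x)
            row.append(total / (samples * samples))
        output.append(row)
    return output

# Toy 8 x 8 feature map whose value equals its column index.
feature = [[float(x) for x in range(8)] for _ in range(8)]
box = (40, 40, 104, 104)  # hypothetical box in image pixels (x1, y1, x2, y2)
print(roi_pool_coords(box, 16))     # (2, 2, 7, 7)
print(roi_align_coords(box, 16))    # (2.5, 2.5, 6.5, 6.5)
print(roi_align(feature, box, 16))  # [[3.5, 5.5], [3.5, 5.5]]
```

In this toy case the rounded box covers a 5 × 5 cell region, whereas the true box spans 4 × 4 cells, so the pooled features are taken from a region shifted and enlarged relative to the candidate box. Keeping fractional coordinates avoids that offset, which matters most for small objects where half a cell is a large fraction of the box.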

Some Improvement Methods of RPN

Proposed Optimized Method for Faster R-CNN

Description of RoI Layer