Abstract

Instance segmentation of high-resolution aerial images is more challenging than object detection and semantic segmentation in remote sensing applications. It uses boundary-aware mask predictions, rather than traditional bounding boxes, to locate objects of interest pixel-wise. Instance segmentation can also distinguish densely distributed objects within the same category by assigning each a different color, which semantic segmentation cannot do. Despite these advantages, few methods are dedicated to high-quality instance segmentation of high-resolution aerial images. This paper proposes a novel instance segmentation method for high-resolution aerial images, termed consistent proposals of instance segmentation network (CPISNet). Following the top-down instance segmentation paradigm, it adopts the adaptive feature extraction network (AFEN) to extract multi-level, bottom-up augmented feature maps at the design space level. Then, the elaborated RoI extractor (ERoIE) extracts the mask RoIs using the refined bounding boxes from the proposal consistent cascaded (PCC) architecture and the multi-level features from AFEN. Finally, a convolution block with a shortcut connection generates the binary mask for instance segmentation. Experiments on the iSAID and NWPU VHR-10 instance segmentation datasets support the following conclusions: (1) each individual module in CPISNet contributes to the overall instance segmentation performance; (2) CPISNet* exceeds vanilla Mask R-CNN by 3.4%/3.8% AP on the iSAID validation/test sets and by 9.2% AP on the NWPU VHR-10 instance segmentation dataset; (3) CPISNet avoids, to some extent, aliasing masks, missing segmentations, false alarms, and poorly segmented masks; (4) CPISNet achieves high-precision instance segmentation for aerial images and delineates objects with well-fitting boundaries.
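The difference between instance and semantic segmentation described above can be illustrated with a toy example. The sketch below is not part of CPISNet; the function names and the tiny 2x4 masks are hypothetical and serve only to show that merging per-instance binary masks into a single foreground map loses the identity of adjacent objects, whereas an instance map keeps them apart.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = object pixel, 0 = background


def to_semantic(instance_masks: List[Mask]) -> Mask:
    """Collapse per-instance masks into one foreground map (pixel-wise OR)."""
    height, width = len(instance_masks[0]), len(instance_masks[0][0])
    return [
        [int(any(mask[r][c] for mask in instance_masks)) for c in range(width)]
        for r in range(height)
    ]


def to_instance_ids(instance_masks: List[Mask]) -> Mask:
    """Paint each instance with its own id (1, 2, ...); 0 stays background."""
    height, width = len(instance_masks[0]), len(instance_masks[0][0])
    ids = [[0] * width for _ in range(height)]
    for instance_id, mask in enumerate(instance_masks, start=1):
        for r in range(height):
            for c in range(width):
                if mask[r][c]:
                    ids[r][c] = instance_id
    return ids


# Two hypothetical objects of the same category lying side by side.
object_a = [[1, 1, 0, 0],
            [1, 1, 0, 0]]
object_b = [[0, 0, 1, 1],
            [0, 0, 1, 1]]

semantic_map = to_semantic([object_a, object_b])
instance_map = to_instance_ids([object_a, object_b])
```

In the semantic map the two objects merge into one foreground region, while the instance map still separates them by id, which is what allows each object to be drawn in a different color.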

Highlights

  • With the rapid development of observation and imaging techniques in the remote sensing field, the quantity and quality of very high-resolution (VHR) optical remote sensing images provided by airborne and spaceborne sensors have increased significantly, which in turn places new demands on the automatic analysis and understanding of remote sensing images

  • Compared with the cascaded methods, consistent proposals of instance segmentation network (CPISNet) still maintains gains of over 1% AP (1.7%, 1.2%, and 1.3% AP over CM R-CNN, HTC, and SCNet, respectively) with a reduced model size

  • As with the experiments on iSAID, we add instance segmentation experiments on the NWPU VHR-10 dataset to verify the rationality of CPISNet

Summary

Introduction

With the rapid development of observation and imaging techniques in the remote sensing field, the quantity and quality of very high-resolution (VHR) optical remote sensing images provided by airborne and spaceborne sensors have increased significantly, which in turn places new demands on the automatic analysis and understanding of remote sensing images. VHR images are applied in a wide range of fields, e.g., urban planning, precision agriculture, and traffic monitoring. With their strong feature extraction and end-to-end training capabilities, deep convolutional neural network (DCNN)-based algorithms have shown their superiority in sub-tasks of computer vision such as object detection, semantic segmentation, and instance segmentation. Various methods combined with DCNNs have been proposed for the intelligent interpretation of remote sensing images. Object detection in remote sensing images can be divided into traditional methods, machine learning-based methods, and deep learning-based methods.
