Abstract

Instance segmentation in high-resolution (HR) remote sensing imagery is a challenging task, more difficult than object detection or semantic segmentation, because it must predict both class labels and pixel-wise instance masks to locate each instance in an image. However, few methods are currently suited to instance segmentation in HR remote sensing images, and the complex backgrounds of such images make the task harder still. In this article, a novel instance segmentation approach for HR remote sensing imagery based on Cascade Mask R-CNN is proposed, called the high-quality instance segmentation network (HQ-ISNet). HQ-ISNet exploits an HR feature pyramid network (HRFPN) to fully utilize multi-level feature maps and to maintain HR feature maps for instance segmentation of remote sensing images. Next, to refine the mask information flow between mask branches, the instance segmentation network version 2 (ISNetV2) is proposed to further improve mask prediction accuracy. Then, we construct a new, more challenging dataset for instance segmentation of remote sensing images, based on the synthetic aperture radar (SAR) ship detection dataset (SSDD) and the Northwestern Polytechnical University very-high-resolution 10-class geospatial object detection dataset (NWPU VHR-10), which can be used as a benchmark for evaluating instance segmentation algorithms on HR remote sensing images. Finally, extensive experimental analyses and comparisons on the SSDD and NWPU VHR-10 datasets show that (1) the HRFPN makes the predicted instance masks more accurate, which effectively enhances instance segmentation performance on HR remote sensing imagery; (2) the ISNetV2 is effective and further improves mask prediction accuracy; and (3) the proposed HQ-ISNet framework is effective and more accurate for instance segmentation in remote sensing imagery than existing algorithms.

Highlights

  • With the rapid development of imaging technology in the field of remote sensing, high-resolution (HR) remote sensing images are provided by many airborne and spaceborne sensors, for instance, RADARSAT-2, Gaofen-3, TerraSAR-X, Sentinel-1, Ziyuan-3, Gaofen-2 and unmanned aerial vehicles (UAV)

  • We construct a new, more challenging dataset for instance segmentation of remote sensing images, based on the synthetic aperture radar (SAR) ship detection dataset (SSDD) and the Northwestern Polytechnical University very-high-resolution 10-class geospatial object detection dataset (NWPU VHR-10), which can be used as a benchmark for evaluating instance segmentation algorithms on HR remote sensing images

  • The framework of HQ-ISNet based on Cascade Mask R-CNN [21] is shown in Figure 3. First, an HR feature pyramid network (HRFPN) replaces the original FPN to fully utilize multi-level feature maps; next, the candidate proposals are generated by the RPN; finally, an instance segmentation network version 2 (ISNetV2) is used to refine the original mask branches and is executed to obtain the final instance segmentation results. In this section, we will present our proposed instance segmentation approach in detail
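
The idea of fusing multi-level feature maps while keeping a high-resolution map, mentioned in the HRFPN highlight, can be illustrated with a small NumPy sketch. This is only an assumption about the general pattern (upsample every level to the finest resolution, concatenate the channels, then pool down to rebuild a pyramid); it is not the authors' HRFPN, it omits all learned convolutions, and the function names `upsample_nearest`, `avg_pool`, `fuse_multilevel` and `build_pyramid` are hypothetical.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Upsample a (C, H, W) feature map by an integer factor (nearest neighbour)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def avg_pool(x, factor):
    """Average-pool a (C, H, W) map by an integer factor; H and W must divide evenly."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def fuse_multilevel(features):
    """Assumed fusion step: bring every level to the finest resolution and stack channels."""
    finest_h = features[0].shape[1]
    upsampled = [upsample_nearest(f, finest_h // f.shape[1]) for f in features]
    return np.concatenate(upsampled, axis=0)

def build_pyramid(fused, num_levels=4):
    """Assumed pyramid step: level i is the fused map pooled by a factor of 2**i."""
    return [fused if i == 0 else avg_pool(fused, 2 ** i) for i in range(num_levels)]

# Toy backbone outputs: 4 levels, channels double as resolution halves.
rng = np.random.default_rng(0)
features = [rng.standard_normal((8 * 2 ** i, 64 // 2 ** i, 64 // 2 ** i)) for i in range(4)]

fused = fuse_multilevel(features)   # one high-resolution map holding all levels
pyramid = build_pyramid(fused)      # multi-scale outputs derived from that map
pyramid_shapes = [p.shape for p in pyramid]
```

In this toy setting every pyramid level carries channels from all backbone levels, which is the property the highlight describes as fully utilizing multi-level feature maps.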

Summary

Introduction

With the rapid development of imaging technology in the field of remote sensing, high-resolution (HR) remote sensing images are provided by many airborne and spaceborne sensors, for instance, RADARSAT-2, Gaofen-3, TerraSAR-X, Sentinel-1, Ziyuan-3, Gaofen-2 and unmanned aerial vehicles (UAV). Nowadays, these HR images have been applied to the national economy and the military fields, such as urban monitoring, ocean monitoring, maritime management, and traffic planning [1,2,3]. Deng et al. [11] devised a method to detect multiscale artificial targets in remote sensing images. An et al. [12] came up with a DRBox-v2 with rotatable boxes to boost the precision and recall rates of detection for object detection in HR SAR images. Xiao et al. [13] came up with a novel anchor generation algorithm to eliminate the deficiencies in the previous anchor-based detectors. However, these detection results with the bounding boxes and the rotational bounding boxes do not reflect the pixel-level contours of the original targets
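
The gap between a bounding box and a pixel-level contour can be made concrete with a toy example. The sketch below is an illustration under stated assumptions, not data from the paper: it builds a synthetic mask of an elongated, diagonally oriented object (the kind of shape a ship might have) and compares its pixel area with the area of its axis-aligned bounding box. The helper name `mask_to_bbox` is hypothetical.

```python
import numpy as np

def mask_to_bbox(mask):
    """Return the tight axis-aligned box (x_min, y_min, x_max, y_max) around a boolean mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])

# Synthetic instance mask: a thin diagonal band in a 10 x 10 image.
mask = np.zeros((10, 10), dtype=bool)
for i in range(10):
    mask[i, max(i - 1, 0):min(i + 2, 10)] = True

x_min, y_min, x_max, y_max = mask_to_bbox(mask)
box_area = (x_max - x_min + 1) * (y_max - y_min + 1)
mask_area = int(mask.sum())
fill_ratio = mask_area / box_area  # fraction of the box actually covered by the object

print(f"box area = {box_area}, mask area = {mask_area}, fill ratio = {fill_ratio:.2f}")
```

Here the box covers the whole image while the object occupies well under a third of it, which is why a box alone says little about the target's outline.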
