Abstract

Although remarkable progress has been made in salient object detection (SOD) in natural scene images (NSI), SOD in optical remote sensing images (RSI) remains challenging because of varying spatial resolutions, cluttered backgrounds, and complex imaging conditions. Two difficulties stand out: (1) locating salient objects accurately; and (2) delineating their subtle boundaries. This paper explores the inherent properties of multi-level features to develop a novel semantic-guided attention refinement network (SARNet) for SOD of optical RSI. Specifically, the proposed semantic guided decoder (SGD) obtains a coarse but accurate location of multi-scale objects by aggregating multiple high-level features, and this global semantic information then guides the integration of subsequent features in a step-by-step feedback manner to make full use of deep multi-level features. Simultaneously, the proposed parallel attention fusion (PAF) module combines cross-level features and semantic-guided information to refine the object’s boundary and gradually highlight the entire object area. Finally, the proposed network is trained end to end in a fully supervised manner. Quantitative and qualitative evaluations on two public RSI datasets and additional NSI datasets across five metrics show that SARNet outperforms 14 state-of-the-art (SOTA) methods without any post-processing.
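The coarse-to-fine guidance idea described above can be illustrated with a small NumPy sketch. This is not the authors' SGD or PAF module: the function names (`coarse_guidance`, `guided_refine`, `upsample_nearest`), the channel-mean reduction, and the sigmoid gating are all assumptions chosen for illustration, standing in for the learned convolutions and attention blocks a real network would use. The sketch assumes feature maps of shape (C, H, W) whose resolutions differ by integer factors.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def coarse_guidance(high_level_feats):
    """Aggregate several high-level features into one (1, H, W) guidance map.

    Each feature is upsampled to the resolution of the first entry and reduced
    over channels with a mean (a hypothetical stand-in for learned layers).
    The per-level maps are then averaged and squashed to (0, 1).
    """
    target_h = high_level_feats[0].shape[1]
    maps = []
    for feat in high_level_feats:
        factor = target_h // feat.shape[1]
        maps.append(upsample_nearest(feat, factor).mean(axis=0, keepdims=True))
    return sigmoid(np.mean(maps, axis=0))


def guided_refine(finer_feats, guidance):
    """Step-by-step feedback: reuse the guidance map at each finer level.

    `finer_feats` is ordered from coarse to fine. At every step the current
    guidance map is upsampled, used as spatial attention on the feature, and
    replaced by a new map computed from the attended feature.
    """
    for feat in finer_feats:
        factor = feat.shape[1] // guidance.shape[1]
        gate = upsample_nearest(guidance, factor)
        attended = feat * gate  # emphasise locations the guidance marks as salient
        guidance = sigmoid(attended.mean(axis=0, keepdims=True))
    return guidance


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical feature pyramid: three high-level and two finer levels.
    high = [rng.standard_normal((8, s, s)) for s in (16, 8, 4)]
    fine = [rng.standard_normal((4, s, s)) for s in (32, 64)]
    saliency = guided_refine(fine, coarse_guidance(high))
    print(saliency.shape)
```

The point of the sketch is only the data flow: a single map built from the deepest features is fed back to gate each shallower level in turn, so localisation comes from high-level semantics while spatial detail comes from the finer features.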

Highlights

  • In recent years, with the continuous improvement of aerial remote sensing and sensor technology, it has become increasingly convenient to obtain very high resolution (VHR) optical remote sensing images (RSI), which, to a certain extent, meet the urgent needs of scene analysis and object detection in airborne earth observation tasks

  • This paper explores salient object detection in complex optical remote sensing scenes and addresses two challenging problems: inaccurate localization and unclear edges of salient objects

  • We propose a novel semantic-guided attention refinement network for salient object detection (SOD) of optical RSI, which is an end-to-end encoding-decoding network architecture

Summary

Introduction

In recent years, with the continuous improvement of aerial remote sensing and sensor technology, it has become increasingly convenient to obtain very high resolution (VHR) optical remote sensing images (RSI), which, to a certain extent, meet the urgent needs of scene analysis and object detection in airborne earth observation tasks. Improving the efficiency and accuracy of scene analysis and rapid object detection in massive optical remote sensing data with cluttered backgrounds is crucial for the further exploration and application of RSI. The goal of object-level salient object detection (SOD) is to locate and separate the most attractive regions from the scene, simulating the human visual attention mechanism [5]. In large-scale optical RSI with cluttered backgrounds and intricate noise, only a small number of regions that differ strongly in color, shape, or texture attract people’s attention.

