Abstract
Salient object detection aims to simulate the human visual system by detecting the most eye-catching objects or regions in a scene. However, owing to the complexity of visual mechanisms, current methods suffer severe performance degradation when a model trained at a fixed resolution is directly evaluated at other resolutions, producing inconsistent predictions for the same regions. Considering that consistency of predictions is essential for salient object detection, a cross-scale resolution-consistent salient object detection method, called RCNet, is proposed. Specifically, to improve generalization across images of varying resolutions and let the model implicitly learn scale invariance, a multi-resolution data enhancement module is constructed to generate images of arbitrary resolutions for the same scene. Moreover, to achieve better multi-level feature fusion, a cross-scale fusion module is developed to fuse high-level semantic features with low-level detail features. Additionally, to explicitly learn the scale invariance of saliency scores, a hybrid salient consistency loss is formulated over salient object detection results at different resolutions. Comprehensive evaluations on five benchmark datasets show that RCNet achieves highly competitive results.
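To make the cross-scale consistency idea concrete, the minimal sketch below (not the authors' code; it assumes a PyTorch saliency network `net`, an illustrative scale range, and a placeholder loss weight `lam`) pairs a supervised loss on the full-resolution branch with an L1 consistency term between predictions of the same image at two resolutions.

```python
# Minimal sketch of a resolution-consistency objective, assuming a PyTorch
# network `net` that maps a (B, 3, H, W) image to a (B, 1, H, W) saliency logit map.
import random
import torch
import torch.nn.functional as F

def consistency_step(net, image, gt_mask, scale_range=(0.5, 1.5), lam=1.0):
    """One training step enforcing consistent saliency across resolutions."""
    b, _, h, w = image.shape

    # Prediction at the original resolution.
    pred_full = net(image)

    # Re-render the same scene at a randomly chosen resolution
    # (stand-in for the paper's multi-resolution data enhancement).
    s = random.uniform(*scale_range)
    new_hw = (max(32, int(h * s)), max(32, int(w * s)))
    image_rs = F.interpolate(image, size=new_hw, mode="bilinear", align_corners=False)
    pred_rs = net(image_rs)

    # Bring the rescaled prediction back to the original grid so the two
    # saliency maps can be compared pixel by pixel.
    pred_rs_up = F.interpolate(pred_rs, size=(h, w), mode="bilinear", align_corners=False)

    # Supervised loss on the full-resolution branch plus a consistency term
    # that penalises disagreement between the two branches.
    sup_loss = F.binary_cross_entropy_with_logits(pred_full, gt_mask)
    cons_loss = F.l1_loss(torch.sigmoid(pred_full), torch.sigmoid(pred_rs_up))
    return sup_loss + lam * cons_loss
```

The paper's hybrid salient consistency loss may combine several such terms; this sketch only illustrates how predictions at two resolutions can be aligned and penalised for disagreement.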