Abstract

In recent years, several unsupervised video object segmentation methods that emphasize the common information in videos have been proposed. Although effective, these methods ignore the information from the shallow layers of the network and thus fail to segment the details of objects. To address this problem, we propose a multi-attention network for unsupervised video object segmentation (MANet). Recent studies show that the deep layers of a network are sensitive to high-level semantic information but capture details poorly, while the opposite holds for shallow layers. Based on this insight, we design a multi-attention module that takes into account the information from the shallow layers in addition to that from the deep layers. By enhancing the common information between video frames while combining the features from the shallow and deep layers, this module can distinguish the primary object and segment its details effectively. Experimental results on the DAVIS-2016 and SegTrack v2 datasets show that our network outperforms state-of-the-art methods.
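As a rough illustration of the two ideas named above (enhancing information shared between frames, and combining shallow with deep features), the following toy sketch may help. It is not the MANet module: the dot-product affinity, the softmax weighting, the per-position concatenation, and the function names `co_attend` and `fuse` are all assumptions made for illustration. It also assumes the shallow and deep features have already been brought to the same spatial resolution.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def co_attend(feat_a, feat_b):
    # Hypothetical co-attention: for each position in frame A, return a
    # weighted average of frame B's features, with weights given by a
    # softmax over dot-product affinities. Positions in A that resemble
    # content in B pull in that shared content.
    channels = len(feat_b[0])
    out = []
    for qa in feat_a:
        weights = softmax([dot(qa, kb) for kb in feat_b])
        out.append([sum(w * kb[c] for w, kb in zip(weights, feat_b))
                    for c in range(channels)])
    return out

def fuse(shallow, deep):
    # Hypothetical fusion: concatenate shallow (detail) and deep (semantic)
    # channels at each spatial position.
    return [s + d for s, d in zip(shallow, deep)]

# Toy features: 2 spatial positions, 2 deep channels, 1 shallow channel.
deep_a = [[1.0, 0.0], [0.0, 1.0]]
deep_b = [[1.0, 0.0], [0.0, 1.0]]
shallow_a = [[0.5], [0.2]]

attended = co_attend(deep_a, deep_b)
fused = fuse(shallow_a, attended)
```

In this toy case each position in frame A attends most strongly to the matching position in frame B, and the fused feature at each position carries one shallow channel followed by two attended deep channels.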
