Abstract

In click-based deep interactive segmentation, click encoding and its fusion with multi-scale features are critical to segmentation performance. Existing click encoding methods incorporate only position priors and lack semantics, leading to unstable interaction efficiency. Meanwhile, to fuse multi-scale features, current methods extract them at the abstract semantic level but neglect the constraints that detailed information imposes on semantic features, making the network prone to over-segmentation. To address these challenges, we propose a cross-self-attention mechanism guided by semantic click embeddings for interactive segmentation. First, we build semantic click embeddings from the semantic features by embedding positive clicks into continuous connected semantic regions while preserving the corrective role of negative clicks, which enriches the semantic priors carried by well-placed clicks. Next, we use self-attention to leverage both the detailed and semantic features of the network, constructing a cross-attention mechanism that suppresses over-segmentation. Finally, the semantic click embedding weights the affinity matrix of the attention mechanism, ensuring that long-distance dependencies are relevant only to the target of interest. Comprehensive experiments show that our approach improves interaction efficiency and achieves state-of-the-art performance on public datasets.
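The affinity-weighting idea can be illustrated with a minimal sketch. This is our own hypothetical rendering, not the paper's exact formulation: the function name, the per-pixel click weight, and the log-space modulation of the affinity matrix are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def click_weighted_cross_attention(detail_feats, semantic_feats, click_embed):
    """Cross-attention sketch: queries come from detailed features, keys and
    values from semantic features, and the affinity matrix is modulated by a
    per-pixel semantic click embedding so that long-range dependencies
    concentrate on the clicked target region.

    detail_feats:   (N, d) per-pixel detailed features (queries)
    semantic_feats: (N, d) per-pixel semantic features (keys/values)
    click_embed:    (N,)   click-derived weight, high inside the clicked
                           semantic region (hypothetical form)
    """
    d = detail_feats.shape[1]
    # Scaled dot-product affinity between detailed and semantic features.
    affinity = detail_feats @ semantic_feats.T / np.sqrt(d)   # (N, N)
    # Weight each key column by the click prior (log-space so that a zero
    # weight suppresses the column after the softmax).
    affinity = affinity + np.log(click_embed + 1e-6)[None, :]
    attn = softmax(affinity, axis=-1)
    return attn @ semantic_feats
```

With a click embedding that is nonzero at a single pixel, every output row collapses toward that pixel's semantic feature, which is the intended effect of restricting long-range dependencies to the target of interest.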
