Abstract

Convolutional Neural Networks (CNNs) can reveal local variation details and multi-scale spatial context in images through low-to-high stages of feature expression; effective fusion of these raw features is key to Semantic Edge Detection (SED). Existing methods generally fuse cross-stage features in a position-aligned manner, which cannot meet the need for diverse semantic context when categorizing different pixels. In this paper, we propose a deep framework for SED, the core of which is a new multi-stage feature fusion structure, called All-HiS-In ACA (All-Higher-Stages-In Adaptive Context Aggregation). All-HiS-In ACA can adaptively select semantic context from all higher stages for detailed features via a cross-stage self-attention paradigm, and thus obtains fused features with high-resolution details for edge localization and rich semantics for edge categorization. In addition, we develop a non-parametric Inter-layer Complementary Enhancement (ICE) module to supplement clues at each stage with their counterparts in adjacent stages. The ICE-enhanced multi-stage features are then fed into the All-HiS-In ACA module. We also construct an Object-level Semantic Integration (OSI) module to further refine the fused features by enforcing the consistency of features within the same object. Extensive experiments demonstrate the superior performance of the proposed method over state-of-the-art works.
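The core idea of the fusion structure can be illustrated with a minimal sketch: each position of the detailed low-stage features forms a query and attends over features gathered from all higher stages, so its context is selected adaptively rather than read from the single aligned position. This is only an illustrative numpy sketch under assumed shapes (positions flattened, channels already projected to a common dimension); the function names and the scaled-dot-product form are assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_stage_attention(low_feats, higher_stage_feats):
    """Hypothetical sketch of all-higher-stages-in context aggregation.

    low_feats:          (N, C) array of detailed low-stage features (queries)
    higher_stage_feats: list of (M_i, C) arrays, one per higher stage
                        (keys/values from every higher stage, pooled together)
    Returns (N, C) fused features: each low-stage position receives a
    weighted mix of context drawn from all higher stages at once.
    """
    # pool every higher stage into one bank of context vectors
    context = np.concatenate(higher_stage_feats, axis=0)      # (sum M_i, C)
    d = low_feats.shape[1]
    scores = low_feats @ context.T / np.sqrt(d)               # (N, sum M_i)
    weights = softmax(scores, axis=-1)                        # rows sum to 1
    return weights @ context                                  # (N, C)

# toy example: 4 high-resolution positions attend over two higher stages
rng = np.random.default_rng(0)
low = rng.standard_normal((4, 8))
highs = [rng.standard_normal((6, 8)), rng.standard_normal((3, 8))]
fused = cross_stage_attention(low, highs)
print(fused.shape)  # (4, 8)
```

In a position-aligned fusion, by contrast, each low-stage position would only receive the upsampled feature at its own coordinate; the attention weights here are what lets different pixels draw on different semantic context.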
