Abstract

Shadow detection is important for scene understanding. Ambiguities in a shadow image, such as shadow-like non-shadow regions and shadow regions with non-shadow patterns, remain very challenging for prevalent CNN-based methods. This work addresses the problem from the new perspective of shape semantics and proposes a Semantic-aware Transformer (SaT) trained in a multi-task learning manner. Concretely, we first propose a shadow detection network that builds on recent progress in Transformer architectures, allowing it to capture significant global interactions between contexts. Next, we design a multi-task learning framework that combines shadow supervision and semantic supervision to perform semantic-aware shadow detection. Finally, we introduce a simple yet effective information buffer unit to overcome the gradient signal conflicts that arise in multi-task learning. Experimental results on three public benchmark datasets (i.e., ISTD, SBU, and UCF) show that SaT effectively detects ambiguous cases and achieves state-of-the-art results.
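
To make the idea of combining shadow supervision with semantic supervision concrete, the following is a minimal sketch of a joint objective, not the loss used in SaT, which the abstract does not specify. It assumes a per-pixel binary cross-entropy for the shadow mask and a cross-entropy over semantic classes, summed with a weight. The function names and the `semantic_weight` parameter are hypothetical, and the Transformer backbone and the information buffer unit are not modeled.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over per-pixel shadow probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def cross_entropy(class_probs, label, eps=1e-7):
    """Cross-entropy for one semantic prediction given the true class index."""
    return -math.log(max(class_probs[label], eps))


def multi_task_loss(shadow_probs, shadow_mask, class_probs, label,
                    semantic_weight=0.5):
    """Weighted sum of the two task losses.

    semantic_weight is an illustrative assumption, not a value from the paper.
    """
    shadow_term = binary_cross_entropy(shadow_probs, shadow_mask)
    semantic_term = cross_entropy(class_probs, label)
    return shadow_term + semantic_weight * semantic_term


if __name__ == "__main__":
    # Toy example: four pixels and three hypothetical semantic classes.
    loss = multi_task_loss(
        shadow_probs=[0.9, 0.2, 0.7, 0.1],
        shadow_mask=[1, 0, 1, 0],
        class_probs=[0.1, 0.8, 0.1],
        label=1,
    )
    print(f"joint loss: {loss:.4f}")
```

In a plain weighted sum like this, the gradients of the two terms can pull shared parameters in different directions, which is the kind of conflict the information buffer unit is meant to address.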
