Semantic-aware Transformer for shadow detection

Kai Zhou, Jing-Long Fang, Wen Wu, Yan-Li Shao, Xing-Qi Wang, Dan Wei

doi:10.1016/j.cviu.2024.103941

Abstract

Shadow detection is important for scene understanding. Ambiguities in a shadow image, such as shadow-like non-shadow regions and shadow regions with non-shadow patterns, remain very challenging for prevalent CNN-based methods. This work attempts to alleviate this problem from the new perspective of shape semantics and proposes a Semantic-aware Transformer (SaT) trained in a multi-task learning manner. Concretely, we first propose a shadow detection network based on recent progress in Transformer architectures, which allows us to capture significant global interactions between contexts. Next, we design a multi-task learning framework that combines shadow supervision and semantic supervision to perform semantic-aware shadow detection. Finally, we introduce a simple yet effective information buffer unit to overcome the gradient signal conflict arising from multi-task learning. Experimental results on three public benchmark datasets (i.e., ISTD, SBU, and UCF) show that SaT effectively detects ambiguous cases and achieves state-of-the-art results.

Full Text