Abstract

Attention mechanisms are rapidly gaining traction in the computer vision community. However, existing works focus on designing a single attention module and then employ that same module in every attention layer of a network, yielding sub-optimal performance. In this paper, we address a learning-to-attention problem by proposing Switchable Attention (SA), which learns to select different kinds of attention modules for different blocks of a Deep Neural Network (DNN). SA computes the attention map over three distinct scopes: local spatial attention (LSA), global spatial attention (GSA), and channel attention (CA). Notably, the introduced trainable parameters are optimized together with the network in an end-to-end manner. SA is a lightweight module that can be embedded in existing networks with little computational overhead. Quantitative experiments show that SA boosts baseline performance on various challenging benchmarks, such as CIFAR-100, ImageNet-1K, and MS COCO, and across computer vision tasks, such as image classification and object detection. Notably, SA outperforms most attention methods with ResNet-50 on ImageNet-1K.
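The abstract's core idea, a per-block learnable switch over three attention scopes (LSA, GSA, CA), can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the paper's implementation: the exact branch formulations and the softmax mixing over trainable logits `alpha` are plausible stand-ins for what "learns to select" might mean, and the full method would backpropagate through `alpha` jointly with the network.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))
    return e / e.sum()

def channel_attention(x):
    # CA (assumed form): sigmoid gate per channel from global average pooling
    pooled = x.mean(axis=(1, 2))               # (C,)
    gate = 1.0 / (1.0 + np.exp(-pooled))       # (C,)
    return x * gate[:, None, None]

def global_spatial_attention(x):
    # GSA (assumed form): softmax weights over all spatial positions
    s = x.mean(axis=0)                         # (H, W) channel-mean map
    w = softmax(s.ravel()).reshape(s.shape)
    return x * w[None, :, :] * s.size          # rescale so mean weight is 1

def local_spatial_attention(x, k=3):
    # LSA (assumed form): sigmoid gate from a local k x k box average
    s = x.mean(axis=0)
    pad = k // 2
    sp = np.pad(s, pad, mode="edge")
    local = np.empty_like(s)
    H, W = s.shape
    for i in range(H):
        for j in range(W):
            local[i, j] = sp[i:i + k, j:j + k].mean()
    gate = 1.0 / (1.0 + np.exp(-local))
    return x * gate[None, :, :]

class SwitchableAttention:
    """One SA block: learnable logits select/mix the three attention scopes.

    In the real model, alpha would be a trainable parameter optimized
    end-to-end with the network; here it is a plain array for illustration.
    """

    def __init__(self):
        self.alpha = np.zeros(3)  # mixing logits over (LSA, GSA, CA)

    def __call__(self, x):
        w = softmax(self.alpha)
        return (w[0] * local_spatial_attention(x)
                + w[1] * global_spatial_attention(x)
                + w[2] * channel_attention(x))
```

Because each branch only rescales the input feature map, the module preserves the input shape and can be dropped into an existing network between blocks, which is consistent with the abstract's claim of a lightweight, embeddable design.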
