Abstract

Attention mechanisms have been shown to play a crucial role in enhancing visual perception tasks. However, most existing approaches estimate channel and spatial attention maps separately, without considering their mutual importance. This results in coarse attention weights for objects of interest from a holistic 3-D perspective. To address this issue, we propose a novel parameter-free Spatial Intersection Attention Module (SIAM), which estimates 3-D attention maps via spatial intersection in a parameter-free manner. Specifically, SIAM first generates two independent mean queries along the two spatial axes and treats the input as keys. Then, by computing a dot product between these mean queries and the keys, SIAM produces two cross-dimension (channel and spatial) attention maps along the two spatial directions and combines them into a single 3-D attention map. The resulting attention maps reason about important areas through spatial intersection, capturing location-aware information that facilitates locating difficult objects in images. We evaluate our method on image classification, object detection, and object segmentation tasks. Extensive experimental results consistently demonstrate that our approach is superior to its counterparts.
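
The abstract leaves several implementation details open, so the following is only a minimal PyTorch sketch of the described pipeline. It assumes simple average pooling for the mean queries, reads the query-key "dot product" as a broadcast element-wise product followed by a sigmoid, and combines the two directional maps by element-wise multiplication; the function name `siam` and all of these choices are our assumptions, not the authors' released code.

```python
import torch

def siam(x: torch.Tensor) -> torch.Tensor:
    """Parameter-free spatial intersection attention (sketch).

    x: feature map of shape (B, C, H, W).
    Returns the input reweighted by a 3-D (C x H x W) attention map.
    """
    # Mean queries along the two spatial axes (assumption: plain
    # average pooling; the abstract does not pin down the pooling op).
    q_h = x.mean(dim=3, keepdim=True)   # (B, C, H, 1), averaged over W
    q_w = x.mean(dim=2, keepdim=True)   # (B, C, 1, W), averaged over H

    # Treat the input itself as the keys and take a broadcast product
    # with each mean query, yielding two cross-dimension (channel and
    # spatial) similarity maps, one per spatial direction.
    a_h = torch.sigmoid(x * q_h)        # height-direction map, (B, C, H, W)
    a_w = torch.sigmoid(x * q_w)        # width-direction map,  (B, C, H, W)

    # Spatial intersection: fuse the two directional maps into one
    # 3-D attention map (assumption: element-wise product; addition
    # or averaging would also be consistent with the abstract).
    attn = a_h * a_w
    return x * attn

# Usage: the module is parameter-free, so it can be dropped into any
# backbone without adding learnable weights.
x = torch.randn(2, 64, 32, 32)
y = siam(x)
assert y.shape == x.shape
```

Because the module introduces no parameters, the only design freedoms are the normalization and the fusion rule, which is why those two steps are flagged as assumptions above.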
