Abstract

We propose an Inner-Imaging three-dimensional (3D) attentional feature fusion module for a residual network, which is a simple yet effective approach for residual networks. In our attention module, we constructed a 3D soft attention feature map to refine the input feature. The map fuses the attentional features from different dimensions, including channel and spatial axes, to create a 3D attention map. Then, we implemented a feature fusion module to further fuse the attentional features. Lastly, the attention module outputs a 3D soft attention map that is applied to the residual branch. The attention module can also model the relationship between attentional features from different dimensions and achieve the interaction between attentional features. This function allows our attention module to acquire more attentional features. To demonstrate the effectiveness of our method, extensive experiments were conducted on several computer vision benchmark datasets, including ImageNet 2012 and Microsoft COCO (MS COCO) 2017 datasets. The experimental results show that our method performed better than the baseline methods in the tasks of image classification, object detection, and instance segmentation tasks.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call