Abstract

Convolutional neural networks (CNNs) have made rapid progress on a range of visual tasks, but the bottom-up convolutional feature extraction process cannot mimic the human visual perception process, which takes advantage of discriminative features. Although attention modules extract top-down discriminative features and have been widely investigated in recent years, current attention modules interrupt the bottom-up convolutional feature extraction process. To tackle this challenge, in this paper we introduce a dense connection structure that fuses the discriminative features from attention modules with the convolutional features, a strategy we term dense attention learning. In addition, to alleviate the over-fitting caused by rapid growth of the feature dimension, we propose a channel-wise attention module that compresses and refines the convolutional features. Based on these strategies, we build a dense attention convolutional neural network (DA-CNN) for visual recognition. Extensive experiments on four challenging datasets, CIFAR-10, CIFAR-100, SVHN, and ImageNet, demonstrate that our DA-CNN outperforms many state-of-the-art methods. Moreover, the effectiveness of both dense attention learning and the channel-wise attention module is validated.
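The abstract does not give the module's equations, so the following is only a minimal NumPy sketch under two assumptions: that the channel-wise attention module follows a squeeze-and-excitation-style gating (global average pool, small bottleneck, sigmoid gates), and that dense attention learning fuses the refined attention features with the unchanged convolutional stream by channel-wise concatenation, as in dense connections. All function and weight names here are illustrative, not from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Hypothetical channel-wise attention: squeeze spatial dims,
    compute per-channel gates through a bottleneck, rescale channels.
    x: feature map (C, H, W); w1: (C, C//r); w2: (C//r, C)."""
    squeezed = x.mean(axis=(1, 2))                      # (C,) global average pool
    gates = sigmoid(np.maximum(squeezed @ w1, 0) @ w2)  # (C,) gates in (0, 1)
    return x * gates[:, None, None]                     # rescale each channel

def dense_attention_fuse(conv_feat, attn_feat):
    """Dense-style fusion: concatenate the refined attention features
    with the bottom-up convolutional features along the channel axis,
    so the convolutional stream is preserved rather than interrupted."""
    return np.concatenate([conv_feat, attn_feat], axis=0)

# Toy example with illustrative sizes (C=8 channels, reduction r=2).
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C, C // r)) * 0.1
w2 = rng.standard_normal((C // r, C)) * 0.1

refined = channel_attention(x, w1, w2)
fused = dense_attention_fuse(x, refined)
print(fused.shape)  # concatenation doubles the channel count: (16, 4, 4)
```

Because the gates lie in (0, 1), the refined map never amplifies a channel, which matches the abstract's description of the module as compressing and refining features before fusion.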
