Abstract

Recently, channel attention mechanisms have been widely used to improve the performance of convolutional neural networks. However, most channel attention mechanisms applied to backbone convolutional neural networks in computer vision derive the channel attention weights from global pooling of each block's output features, ignoring the spatial information of the corresponding original features and the potential relationships between adjacent layers. To address the insufficient use of the spatial information in the original features and the inability to adaptively learn the potential associations among all features in a block before producing channel attention weights, we propose a new Cross-layer Channel Attention Mechanism (CCAM), which replaces the global pooling operation with a matrix that preserves spatial information, takes the input and output features of each block as inputs, and outputs the channel attention weights of the corresponding features simultaneously. Compared with other attention mechanisms, CCAM has three advantages: first, it makes full use of the spatial information of each layer's features; second, it encourages feature reuse and fusion; third, it is better at discovering the potential relationships between the features of different layers in a block. Our simulation results demonstrate that CCAM effectively extracts the attention weights of different layers and achieves better performance on CIFAR-10, CIFAR-100, ImageNet-1K, MS COCO detection, and VOC detection with a small additional computational cost compared with the corresponding convolutional neural networks.
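The abstract does not give CCAM's exact formulation, but the sketch below illustrates one plausible reading of the described design in PyTorch: a learnable projection over flattened spatial positions replaces global pooling, and a shared MLP consumes the descriptors of a block's input and output features to emit both sets of channel weights at once. The module name, layer sizes, shared-MLP structure, and the assumption that both feature maps share the same spatial resolution are all illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn


class CrossLayerChannelAttention(nn.Module):
    """Hypothetical CCAM-style module: jointly produces channel attention
    weights for a block's input and output features, using a learnable
    projection over spatial positions instead of global average pooling."""

    def __init__(self, in_channels, out_channels, spatial_size, hidden=64):
        super().__init__()
        # Learnable matrix over spatial positions (replaces global pooling):
        # maps each channel's flattened H*W response map to one descriptor,
        # so spatial structure influences the descriptor (assumed design).
        self.spatial_proj_in = nn.Linear(spatial_size, 1, bias=False)
        self.spatial_proj_out = nn.Linear(spatial_size, 1, bias=False)
        # Shared MLP over the concatenated descriptors of both layers, so
        # both sets of weights are predicted from the same cross-layer
        # context and can capture relationships between the two layers.
        self.mlp = nn.Sequential(
            nn.Linear(in_channels + out_channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, in_channels + out_channels),
        )
        self.in_channels = in_channels

    def forward(self, x_in, x_out):
        b = x_in.size(0)
        # Flatten spatial dims and project: (B, C, H*W) -> (B, C).
        d_in = self.spatial_proj_in(x_in.flatten(2)).squeeze(-1)
        d_out = self.spatial_proj_out(x_out.flatten(2)).squeeze(-1)
        # The joint descriptor lets the MLP model cross-layer associations.
        joint = torch.cat([d_in, d_out], dim=1)
        weights = torch.sigmoid(self.mlp(joint))
        w_in, w_out = weights.split(
            [self.in_channels, weights.size(1) - self.in_channels], dim=1
        )
        # Reweight both feature maps channel-wise, simultaneously.
        return (x_in * w_in.view(b, -1, 1, 1),
                x_out * w_out.view(b, -1, 1, 1))


# Example: a residual block whose input has 64 channels and whose output
# has 128 channels, both at 32x32 resolution (hypothetical shapes).
ccam = CrossLayerChannelAttention(64, 128, spatial_size=32 * 32)
x_in = torch.randn(2, 64, 32, 32)
x_out = torch.randn(2, 128, 32, 32)
y_in, y_out = ccam(x_in, x_out)
```

One design point worth noting under these assumptions: because a single MLP sees the descriptors of both layers, the weight for each channel can depend on features from the adjacent layer, which is what distinguishes a cross-layer scheme from per-layer squeeze-and-excitation-style attention.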
