An Overview of the Attention Mechanisms in Computer Vision

Xiao Yang

doi:10.1088/1742-6596/1693/1/012173

Abstract

Deep convolutional neural networks (CNNs) play an important role in computer vision and image processing. To further improve CNN performance, researchers have explored several directions, including improved activation functions, new loss functions, parameter regularization, and new network structures. However, the major breakthroughs in CNNs have come from innovations in network structure, whose design can be inspired by the cognitive processes of the human brain. The visual attention mechanism is one of the important features of the human visual system and, when applied to computer vision, is essential in image generation, scene classification, and target detection and tracking. This overview focuses on the attention mechanism models commonly used in computer vision and summarizes their categorizations, principles, and outlook.

Full Text