Attention mechanisms in computer vision: A survey

Meng-Hao Guo, Jiang-Jiang Liu, Ralph R Martin, Peng-Tao Jiang, Ming-Ming Cheng, Song-Hai Zhang, Shi-Min Hu, Zheng-Ning Liu, Tian-Xing Xu, Tai-Jiang Mu

doi:10.1007/s41095-022-0271-y

Open Access

https://doi.org/10.1007/s41095-022-0271-y

Abstract

Humans can naturally and effectively find salient regions in complex scenes. Motivated by this observation, attention mechanisms were introduced into computer vision with the aim of imitating this aspect of the human visual system. Such an attention mechanism can be regarded as a dynamic weight adjustment process based on features of the input image. Attention mechanisms have achieved great success in many visual tasks, including image classification, object detection, semantic segmentation, video understanding, image generation, 3D vision, multimodal tasks, and self-supervised learning. In this survey, we provide a comprehensive review of various attention mechanisms in computer vision and categorize them according to approach, such as channel attention, spatial attention, temporal attention, and branch attention; a related repository https://github.com/MenghaoGuo/Awesome-Vision-Attentions is dedicated to collecting related work. We also suggest future directions for attention mechanism research.
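
The description of attention as a dynamic weight adjustment process based on input features can be illustrated with a small sketch. This is not the survey's method or any specific published module: the function name `channel_attention`, the use of global average pooling, and the parameter-free sigmoid are all assumptions chosen for illustration. Real channel attention modules typically insert learned layers between the pooling step and the sigmoid.

```python
import math


def channel_attention(feature_map):
    """Toy channel attention (hypothetical helper, for illustration only).

    feature_map: list of channels, each a 2D list (H x W) of floats.
    Returns (weights, reweighted), where each channel is scaled by a
    weight computed from that channel's own content.
    """
    # Summarise each channel with global average pooling (assumption).
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Map each summary to a weight in (0, 1) with a sigmoid.
    # No learned parameters are used here, unlike real attention modules.
    weights = [1.0 / (1.0 + math.exp(-p)) for p in pooled]
    # Rescale every value in a channel by that channel's weight.
    reweighted = [
        [[w * value for value in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]
    return weights, reweighted


if __name__ == "__main__":
    # Two 2x2 channels: one all zeros, one with stronger activations.
    features = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[2.0, 2.0], [2.0, 2.0]],
    ]
    weights, reweighted = channel_attention(features)
    print(weights)
```

The weights depend on the input itself, so a different image yields different weights. That input dependence is what separates this from a fixed, learned scaling factor.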

Highlights

Methods for diverting attention to the most important regions of an image and disregarding irrelevant parts are called attention mechanisms; the human visual system uses one [1–4] to assist in analyzing and understanding complex scenes efficiently and effectively.
Chaudhari et al. [141] provided a survey of attention models in deep neural networks that concentrates on their application to natural language processing, while our work focuses on computer vision.

Summary

Introduction

Manuscript received: 2021-12-31; accepted: 2022-01-18.

Methods for diverting attention to the most important regions of an image and disregarding irrelevant parts are called attention mechanisms; the human visual system uses one [1–4] to assist in analyzing and understanding complex scenes efficiently and effectively. This in turn has inspired researchers to introduce attention mechanisms into computer vision systems to improve their performance. The first phase begins with RAM [31], pioneering work that combined deep neural networks with attention mechanisms. It recurrently predicts the important region and updates the whole network in an end-to-end manner through a policy gradient.

Journal: Computational Visual Media
Publication Date: Mar 15, 2022
Citations: 858
License type: open-access

Similar Papers

HAR-Net: Joint Learning of Hybrid Attention for Single-stage Object Detection
Ya-Li Li ... Shengjin Wang
IEEE Transactions on Image Processing | VOL. 29
11 Dec 2019

A Streamlined Attention Mechanism for Image Classification and Fine-Grained Visual Recognition
Dakshayani D Himabindu ... Praveen S Kumar
MENDEL | VOL. 27
21 Dec 2021

Mixed Attention Mechanism for Small-Sample Fine-grained Image Classification
Xiaoxu Li ... Jie Cao
01 Nov 2019

Efficient Attention Pyramid Network for Semantic Segmentation
Qirui Yang ... Kunyuan Hu
IEEE Access | VOL. 9
01 Jan 2020
