Research on the application of transformer in computer vision

Guoli Bai, Haosen Guo, Chuzhen Xiao

doi:10.1088/1742-6596/2649/1/012033

Abstract

The Transformer is a deep neural network model that uses attention mechanisms to improve model performance. It first gained significant attention in natural language processing. In recent years, continuous improvements and extensions to its structure have also produced many important breakthroughs in computer vision (CV) tasks, attracting the interest of many researchers. However, comprehensive review articles on the application and development of the Transformer in computer vision are lacking. This paper summarizes the Transformer’s applications and advancements in computer vision. It discusses the Transformer model’s fundamental ideas and overall structure, and introduces its applications in fields such as image classification, object detection, and image generation, as well as the advantages of fusion models that combine the Transformer with convolutional neural networks (CNNs). The paper provides a detailed analysis of classic models such as the Vision Transformer (ViT) and the Detection Transformer (DETR), and discusses their strengths, weaknesses, and improvement methods. Finally, the paper summarizes the Transformer’s evolution in computer vision and looks ahead to its future development.


Journal: Journal of Physics: Conference Series
Publication Date: Nov 1, 2023
License type: cc-by
