To describe the content of image: The view from image captioning

Xiaohan Hou

doi:10.54254/2755-2721/5/20230511

Abstract

Image captioning, which combines natural language processing and computer vision, aims to have a machine automatically generate descriptions of images. The task can be divided into two parts, both of which depend on correctly understanding language and images from a semantic and syntactic perspective. As the literature on the subject grows, it is becoming harder to keep up with the most recent advances in image captioning, and the review papers currently available do not cover those findings in enough detail. This work reviews the approaches, benchmarks, datasets, and evaluation metrics currently used for image captioning. Most ongoing research concentrates on robust learning-based techniques, with deep reinforcement learning, adversarial learning, and attention mechanisms appearing to be at its core. Image captioning is a relatively new field in computer vision research, and its fundamental problem is generating a comprehensive natural language description of a source image. This paper surveys and evaluates earlier work on image captioning. It first introduces the applications and task settings of image captioning. It then analyzes captioning algorithms based on encoder-decoder and template structures and discusses the merits and drawbacks of each approach. Next, it presents the evaluation metrics and benchmark datasets for image captioning. Finally, it outlines prospects for future progress in image captioning.
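To make the template-based family mentioned in the abstract concrete, the following is a minimal sketch, not the method of any particular paper. It assumes an upstream detector has already produced one label per slot; the slot names, the template string, and the function `fill_template` are all hypothetical and chosen only for illustration.

```python
from typing import Dict

# Hypothetical fixed sentence template; the slot names are assumptions.
TEMPLATE = "A {attribute} {object} is {action} {scene}."
REQUIRED_SLOTS = ("attribute", "object", "action", "scene")


def fill_template(detections: Dict[str, str], template: str = TEMPLATE) -> str:
    """Fill each slot of a fixed sentence template with a detected label."""
    missing = [slot for slot in REQUIRED_SLOTS if slot not in detections]
    if missing:
        # A template method cannot produce a caption when a slot is empty.
        raise ValueError(f"missing slots: {missing}")
    return template.format(**detections)


if __name__ == "__main__":
    # Labels that a (hypothetical) visual detector might return for one image.
    detections = {
        "attribute": "brown",
        "object": "dog",
        "action": "running on",
        "scene": "the beach",
    }
    print(fill_template(detections))  # A brown dog is running on the beach.
```

A fixed template of this kind yields grammatical but rigid captions, since the sentence structure never changes. Encoder-decoder models trade that rigidity for a learned decoder that generates the sentence word by word.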

More From: Applied and Computational Engineering

