Understanding remote sensing imagery like reading a text document: What can remote sensing image captioning offer?

Xiao Huang, Kaixuan Lu, Siqin Wang, Junyu Lu, Xiao Li, Ruiqian Zhang

doi:10.1016/j.jag.2024.103939

Abstract

Remote sensing imagery offers intricate and nuanced data, and interpreting it requires a thorough understanding of the relationships among varied geographical elements and events. In this study, we explore the transition from the image domain to the text domain by employing four state-of-the-art image captioning algorithms: BLIP, mPLUG, OFA, and X-VLM. Specifically, we investigate (1) the stability of these algorithms when applied to remote sensing image captioning, (2) the preservation of similarity between images and their corresponding captions, and (3) the characteristics of their caption embedding spaces. The results suggest moderate consistency across the captions generated by the different models, with observable variations depending on the urban entities present. In addition, a dynamic relationship emerges between the image space and the corresponding caption space, evidenced by their fluctuating correlation coefficients. Most importantly, patterns within the caption embedding space align with the observed land cover and land use in the image patches, reaffirming the potential of our pilot work as an impactful analytical approach in future remote sensing analytics. We advocate that integrating image captioning techniques with remote sensing imagery paves the way for an innovative data extraction and interpretation approach with diverse applications.
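As a rough illustration of question (2), the sketch below shows one way to check whether similarity between images is preserved between their captions. It is not the authors' implementation. The abstract does not state which similarity measure or correlation coefficient is used, so cosine similarity and Pearson correlation are assumptions here, as are all function names and the toy vectors standing in for real image and caption embeddings.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_similarities(vectors):
    """Cosine similarity for every unordered pair, in a fixed pair order."""
    pairs = combinations(range(len(vectors)), 2)
    return [cosine(vectors[i], vectors[j]) for i, j in pairs]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def similarity_preservation(image_vecs, caption_vecs):
    """Correlate image-pair similarities with caption-pair similarities.

    image_vecs[k] and caption_vecs[k] must describe the same image patch.
    """
    return pearson(
        pairwise_similarities(image_vecs),
        pairwise_similarities(caption_vecs),
    )


# Toy embeddings (hypothetical): each caption vector is a scaled copy of
# its image vector, so the pairwise cosine similarities match in both spaces.
image_vecs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
caption_vecs = [[2.0, 0.0], [1.6, 1.2], [0.0, 3.0]]

score = similarity_preservation(image_vecs, caption_vecs)
print(f"similarity preservation: {score:.3f}")
```

A score near 1 means that image pairs that look alike also receive alike captions; a score near 0 means the caption space does not reflect image-space structure. With real data, the vectors would come from an image encoder and a text encoder of the reader's choosing.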

Journal: International Journal of Applied Earth Observation and Geoinformation
Publication Date: Jun 3, 2024
License type: CC BY

Similar Papers

Remote sensing image caption generation via transformer and reinforcement learning
Xiangqing Shen ... Jiaqi Zhao
Multimedia Tools and Applications | VOL. 79
17 Jul 2020

A Scientometric Visualization Analysis of Image Captioning Research From 2010 to 2020
Wenxuan Liu ... Xiaoqiang Cheng
IEEE Access | VOL. 9
01 Jan 2020

Remote sensing image captioning via Variational Autoencoder and Reinforcement Learning
Xiangqing Shen ... Mingming Liu
Knowledge-Based Systems | VOL. 203
23 Apr 2020

Multi-Scale Remote Sensing Semantic Analysis Based on a Global Perspective
Wei Cui ... Ziwei Wang
ISPRS International Journal of Geo-Information | VOL. 8
17 Sep 2019
