Abstract

Remote sensing image captioning, which aims to understand high-level semantic information and the interactions of different ground objects, is an emerging research topic. Although image captioning has developed rapidly with convolutional neural networks (CNNs) and recurrent neural networks (RNNs), captioning remote sensing images still suffers from two main limitations. First, the scales of objects in remote sensing images vary dramatically, which makes it difficult to obtain an effective image representation. Second, the visual relationships in remote sensing images remain underused, even though they have great potential to improve the final performance. To address these two limitations, an effective framework for captioning remote sensing images is proposed in this paper. The framework is based on multi-level attention and multi-label attribute graph convolution. Specifically, the proposed multi-level attention module can adaptively focus not only on specific spatial features but also on features of specific scales. Moreover, the designed attribute graph convolution module employs an attribute graph to learn more effective attribute features for image captioning. Extensive experiments show that the proposed method achieves superior performance on the UCM-captions, Sydney-captions, and RSICD datasets.
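
To make the first module concrete, below is a minimal PyTorch-style sketch of multi-level attention as described in the abstract: a spatial attention over the feature map of each scale, followed by an attention over the scales themselves, both conditioned on the decoder hidden state. The class, tensor shapes, and parameter names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MultiLevelAttention(nn.Module):
    """Sketch of multi-level attention (illustrative, not the paper's code):
    attend over spatial positions within each CNN scale, then over the
    per-scale context vectors, conditioned on the decoder hidden state h."""

    def __init__(self, feat_dim, hidden_dim, attn_dim):
        super().__init__()
        self.spatial_attn = nn.Sequential(
            nn.Linear(feat_dim + hidden_dim, attn_dim), nn.Tanh(),
            nn.Linear(attn_dim, 1))
        self.scale_attn = nn.Sequential(
            nn.Linear(feat_dim + hidden_dim, attn_dim), nn.Tanh(),
            nn.Linear(attn_dim, 1))

    def forward(self, feats_per_scale, h):
        # feats_per_scale: list of (B, N_s, feat_dim) tensors, one per scale
        # h: decoder hidden state, shape (B, hidden_dim)
        scale_ctx = []
        for feats in feats_per_scale:
            h_exp = h.unsqueeze(1).expand(-1, feats.size(1), -1)
            alpha = torch.softmax(
                self.spatial_attn(torch.cat([feats, h_exp], dim=-1)), dim=1)
            scale_ctx.append((alpha * feats).sum(dim=1))        # (B, feat_dim)
        ctx = torch.stack(scale_ctx, dim=1)                     # (B, S, feat_dim)
        h_exp = h.unsqueeze(1).expand(-1, ctx.size(1), -1)
        beta = torch.softmax(
            self.scale_attn(torch.cat([ctx, h_exp], dim=-1)), dim=1)
        return (beta * ctx).sum(dim=1)                          # fused context
```

The two-stage design lets the decoder weight both where to look within a scale and which scale is most informative at each decoding step, matching the abstract's claim of attending to "features of specific scales".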

Highlights

  • With the great progress of remote sensing technology, an increasing number of high-quality remote sensing images are being captured, providing a large amount of data for research [1], [2].

  • Remote sensing image captioning aims to understand the high-level semantic information and the interactions of different ground objects, describing a remote sensing scene at a higher semantic level.

  • In this work, a remote sensing image captioning framework based on multi-level attention and multi-label attribute graph convolution is proposed to improve performance from these two aspects.



Introduction

With the great progress of remote sensing technology, an increasing number of high-quality remote sensing images are being captured, providing a large amount of data for research [1], [2]. Generating natural-language descriptions for remote sensing images can provide richer high-level semantic information, such as scene structures or object relationships. Remote sensing image captioning, which aims to understand the high-level semantic information and the interactions of different ground objects, provides a far richer description of a remote sensing scene at a higher semantic level by generating a sentence that abstracts its content. Accurate and flexible sentences are generated automatically to describe the content of remote sensing images. Remote sensing image captioning identifies the ground objects at different levels and analyzes their attributes and spatial relationships in the aerial view [7]. The interactions between objects are visual relationships that are embedded in image captions. For example, the caption "Some white planes are in an airport" describes the visual relationship between the planes and the airport.
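
The attribute graph convolution module is only described at a high level here; as a rough illustration, it can be thought of as a standard graph convolution (in the style of Kipf and Welling, 2017) applied to multi-label attribute nodes. The sketch below assumes, for illustration only, that the adjacency matrix encodes attribute co-occurrence; the class and variable names are hypothetical.

```python
import torch
import torch.nn as nn

class AttributeGCN(nn.Module):
    """Illustrative attribute graph convolution layer (not the paper's exact
    formulation): propagates information between attribute nodes through a
    normalized adjacency matrix."""

    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)
        # Symmetrically normalize A + I (self-loops), as in standard GCNs.
        a_hat = adj + torch.eye(adj.size(0))
        d = a_hat.sum(dim=1).pow(-0.5)
        self.register_buffer("norm_adj",
                             d.unsqueeze(1) * a_hat * d.unsqueeze(0))

    def forward(self, node_feats):
        # node_feats: (num_attributes, in_dim) attribute embeddings
        return torch.relu(self.norm_adj @ self.weight(node_feats))
```

In a full captioning pipeline, the refined attribute features could be fused with the visual context before decoding, so that relationships such as the one between "planes" and "airport" inform the generated sentence.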
