Abstract
This study proposes an end-to-end image description generation model based on word embedding technology to realise the classification and identification of Populus euphratica and Tamarix in complex remote sensing images by providing descriptions in precise and concise natural sentences. First, category ambiguity over large-scale regions in remote sensing images is addressed by introducing the co-occurrence matrix and global vectors for word representation to generate the word vector features of the object to be identified. Second, a new multi-level end-to-end model is employed to further describe the content of remote sensing images and to better advance the description tasks for P. euphratica and Tamarix in remote sensing images. Experimental results reveal that the natural language sentences generated using this method can better describe P. euphratica and Tamarix in remote sensing images compared with conventional deep learning methods.
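The abstract's first step builds word-vector features from a co-occurrence matrix, on which GloVe-style embeddings are then trained. As a minimal sketch of that first stage (the token list, vocabulary, and window size are illustrative assumptions, not the paper's actual data), a symmetric co-occurrence matrix can be built as follows:

```python
import numpy as np

def cooccurrence_matrix(tokens, vocab, window=2):
    """Count how often vocabulary words appear within `window`
    positions of each other (symmetric context window)."""
    idx = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(tokens):
        if w not in idx:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] in idx:
                X[idx[w], idx[tokens[j]]] += 1
    return X

# Toy corpus for illustration only
tokens = "populus grows near tamarix tamarix grows near river".split()
vocab = sorted(set(tokens))
X = cooccurrence_matrix(tokens, vocab)
```

GloVe then fits word vectors so that their dot products approximate the logarithm of these co-occurrence counts; the matrix above is the statistic that training consumes.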
Highlights
This study proposes an end-to-end image description generation model based on word embedding technology to realise the classification and identification of Populus euphratica and Tamarix in complex remote sensing images by providing descriptions in precise and concise natural sentences
The retrieval performance of the algorithm was improved by combining a sparse autoencoder with convolutional neural networks (CNNs), which reduced the time required for labelling and improved the operational efficiency of the model
The validity of the approach was demonstrated, revealing that the model could effectively extract the semantic information of objects of interest and better describe the contents of remote sensing images.
Scarpa [15] designed a very compact CNN architecture that can be trained precisely on small data sets and achieved good recognition results on images from various multi-resolution sensors. Maggiori [16] proposed a spatially fine classification algorithm for aeronautical satellite images, based on pixel semantics, in conjunction with a deep CNN
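The highlight above mentions combining a sparse autoencoder with CNNs. The defining ingredient of a sparse autoencoder is a sparsity penalty that pushes the mean hidden activation toward a small target value; the sketch below shows the standard KL-divergence form of that penalty (the target sparsity rho, weight beta, and toy activations are assumptions for illustration, not values from the paper):

```python
import numpy as np

def kl_sparsity_penalty(activations, rho=0.05, beta=3.0):
    """KL-divergence sparsity penalty of a sparse autoencoder:
    penalises deviation of the mean hidden activation rho_hat
    (per hidden unit, over the batch) from the target rho."""
    rho_hat = activations.mean(axis=0)
    rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)  # avoid log(0)
    kl = (rho * np.log(rho / rho_hat)
          + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return beta * kl.sum()

# Toy sigmoid-like activations in [0, 0.1]: mean is near rho,
# so the penalty stays small
rng = np.random.default_rng(0)
h = rng.random((32, 16)) * 0.1
penalty = kl_sparsity_penalty(h)
```

Adding this term to the reconstruction loss is what forces most hidden units to stay near zero, which is why such a front end can reduce the labelling effort of the downstream CNN.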
Summary
The optimum F-score of 0.9069 in Table 3 is obtained when the forward propagation of the coding layer is the IndRNN and the backward propagation is the Bi-LSTM network. This is because the neurons in the IndRNN are independent and facilitate the cross-layer transmission of information, which allows hidden details to be learned more effectively. The IndRNN-F + LSTM-B combination provides smaller P, R and F values than the IndRNN-F + BiRNN-B combination: although a single LSTM network can learn long sequences effectively, it ignores semantic information between some of the pixels in the fixed window and the global image, so it is not well suited to describing image contents. The proposed annotation strategy provides superior P, R and F values to the conventional scheme because the conventional method labels single pixels, which ignores the correlation between adjacent pixels and cannot mine the overall semantic information of an image. Although the resolution of the QuickBird image data is lower than that of the UAV images, its spectral coverage is greater, which enhances the recognition effect for the QuickBird images
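The P, R and F values compared throughout the summary are related by the standard F-measure, the harmonic mean of precision and recall. A one-line sketch (the precision/recall inputs below are hypothetical, chosen only to illustrate the formula, not the paper's measured values):

```python
def f_score(precision, recall):
    """F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical inputs for illustration only
f = f_score(0.92, 0.89)
```

Because the harmonic mean is dominated by the smaller operand, an encoder that trades a little precision for much better recall (or vice versa) can still raise F, which is how the IndRNN + Bi-LSTM combination reaches the best overall score.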