Multi-label remote sensing classification with self-supervised gated multi-modal transformers.

Na Liu, Ye Yuan, Guodong Wu, Sai Zhang, Jie Leng, Lihong Wan

doi:10.3389/fncom.2024.1404623

Abstract

Following their success in machine learning, Transformers are attracting growing interest in remote sensing (RS). However, RS research has been hampered by the lack of large labeled datasets and by inconsistent data modalities arising from the diversity of RS platforms. With the rise of self-supervised learning (SSL) algorithms in recent years, RS researchers have begun to apply the "pre-training and fine-tuning" paradigm to RS. Still, few studies address multi-modal data fusion in RS: most use only one modality or simply concatenate data from multiple modalities. To study a more efficient multi-modal data fusion scheme, we propose a multi-modal fusion mechanism based on gated unit control (MGSViT). We pretrain a ViT model on the BigEarthNet dataset by combining two commonly used SSL algorithms, and propose an intra-modal and inter-modal gated fusion unit for feature learning that combines multispectral (MS) and synthetic aperture radar (SAR) data. The method effectively combines data from different modalities to extract key feature information. After fine-tuning, comparison experiments show that it outperforms state-of-the-art algorithms on all downstream classification tasks, which verifies the validity of the proposed method.
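The abstract does not give the equations of the gated fusion unit, so the following is only a minimal sketch of the general idea of gated fusion, not the MGSViT design itself. It assumes a common formulation in which a sigmoid gate, computed from the concatenated MS and SAR features, weights an element-wise convex combination of the two. The function name `gated_fusion`, the parameter names `w_gate` and `b_gate`, and the feature shapes are hypothetical, and the weights are random rather than learned.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, maps real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(ms_feat, sar_feat, w_gate, b_gate):
    """Fuse two modality feature arrays with an element-wise gate.

    Assumed shapes (illustrative only):
      ms_feat, sar_feat: (batch, dim)
      w_gate: (2 * dim, dim), b_gate: (dim,)
    """
    # The gate looks at both modalities before deciding how to mix them.
    concat = np.concatenate([ms_feat, sar_feat], axis=-1)
    gate = sigmoid(concat @ w_gate + b_gate)
    # Convex combination: gate near 1 favors MS, gate near 0 favors SAR.
    fused = gate * ms_feat + (1.0 - gate) * sar_feat
    return fused, gate


# Toy inputs standing in for encoder outputs (random, not real data).
rng = np.random.default_rng(0)
batch, dim = 4, 8
ms = rng.normal(size=(batch, dim))
sar = rng.normal(size=(batch, dim))
w = rng.normal(scale=0.1, size=(2 * dim, dim))
b = np.zeros(dim)

fused, gate = gated_fusion(ms, sar, w, b)
print(fused.shape)  # (4, 8)
```

Unlike plain concatenation, a gate of this kind lets the model decide per feature how much each modality contributes. In a trained model, `w_gate` and `b_gate` would be learned together with the encoders.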

Journal: Frontiers in Computational Neuroscience
Publication Date: Sep 24, 2024
License type: CC BY 4.0
