Abstract

Multimodal image fusion combines information from multiple imaging modalities to generate a composite image containing complementary information. The task is challenging because of the heterogeneous nature of the data, misalignment and nonlinear relationships between the inputs, and incomplete data during the fusion process. In recent years, several attention mechanisms have been introduced to enhance the performance of deep learning models, yet little literature is available on multimodal image fusion using attention mechanisms. This paper studies and analyzes the latest deep learning approaches, including attention mechanisms, for multimodal image fusion. As a result of this study, a graphical taxonomy based on the different image modalities, fusion strategies, fusion levels, and evaluation metrics for fusion tasks is put forth. The focus is on multimodal image fusion frameworks that use deep learning techniques as their core methodology. The paper also sheds light on the challenges and future research directions in this field, the application domains, and the benchmark datasets used for multimodal fusion tasks. This paper contributes to research on multimodal image fusion and can help researchers select a suitable methodology for their applications.
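
To make the idea of attention-based fusion concrete, the following is a minimal, hedged sketch (not the method of any specific surveyed paper): two modality feature maps, e.g. visible and infrared features, are combined with spatial attention weights that are normalized across the modalities at each pixel. The class name `AttentionFusion` and the choice of a softmax over per-modality 1x1-convolution scores are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Illustrative sketch: fuse two modality feature maps with learned
    spatial attention weights (softmax over modalities at each pixel)."""

    def __init__(self, channels: int):
        super().__init__()
        # One 1x1 convolution per modality produces a single-channel attention logit map.
        self.score_a = nn.Conv2d(channels, 1, kernel_size=1)
        self.score_b = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        # Stack per-modality logits and normalize across the modality axis.
        logits = torch.cat([self.score_a(feat_a), self.score_b(feat_b)], dim=1)  # (B, 2, H, W)
        weights = torch.softmax(logits, dim=1)                                   # (B, 2, H, W)
        # Pixel-wise weighted sum keeps the more informative modality at each location.
        return weights[:, 0:1] * feat_a + weights[:, 1:2] * feat_b

if __name__ == "__main__":
    fuse = AttentionFusion(channels=64)
    a = torch.randn(1, 64, 128, 128)   # e.g. visible-image features (assumed shapes)
    b = torch.randn(1, 64, 128, 128)   # e.g. infrared-image features
    print(fuse(a, b).shape)            # torch.Size([1, 64, 128, 128])
```

In practice, the surveyed frameworks vary in where attention is applied (feature level, decision level, or within an encoder-decoder), and many use channel or cross-modal attention rather than the simple spatial weighting sketched here.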
