Deep Multimodal Fusion of Visual and Auditory Features for Robust Material Recognition

Yifei Shi, Shuai Yang, Huei Ruey Ong, Yuxin Fan

doi:10.15837/ijccc.2024.5.6457

Open Access

https://doi.org/10.15837/ijccc.2024.5.6457

Abstract

This paper presents a deep neural network that fuses visual and auditory data to improve material recognition. Traditional recognition techniques that rely on a single data modality are limited in accuracy and robustness, especially in complex real-world environments. To address these limitations, we develop a multimodal fusion-based model. The proposed approach first extracts features from the input images and sounds separately, using CNNs for the images and spectral analysis for the sounds. A concatenation layer then integrates the visual and auditory features. Extensive experiments demonstrate superior material classification over unimodal methods, with 100% test accuracy across seven material types. The multimodal fusion model also demonstrates stronger resilience to noise and illumination variations. This research provides a valuable foundation for robust material perception in intelligent systems.
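The pipeline described in the abstract (per-modality feature extraction followed by a concatenation layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy convolution standing in for the CNN, the band-energy audio features, the random untrained weights, and all names (`visual_features`, `audio_features`, `fuse_and_classify`) and sizes are assumptions made for illustration. Only the seven output classes come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def visual_features(image, kernels):
    """Toy stand-in for a CNN (assumption): valid 2-D filtering, ReLU, global average pooling."""
    k = kernels.shape[-1]
    # Every k x k patch of the image: shape (H-k+1, W-k+1, k, k)
    patches = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    # Correlate each patch with each kernel: shape (n_kernels, H-k+1, W-k+1)
    maps = np.einsum("ijkl,nkl->nij", patches, kernels)
    # ReLU, then average each feature map down to one number
    return np.maximum(maps, 0.0).mean(axis=(1, 2))

def audio_features(signal, n_bands=8):
    """Simple spectral analysis (assumption): log energy in equal-width FFT bands."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    bands = np.array_split(power, n_bands)
    return np.log1p(np.array([band.sum() for band in bands]))

def fuse_and_classify(v, a, W, b):
    """Concatenate both feature vectors, then apply a linear layer and softmax."""
    fused = np.concatenate([v, a])           # the feature-level fusion step
    logits = W @ fused + b
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    return fused, exp / exp.sum()

# Hypothetical inputs: a 32x32 grayscale image and a 1024-sample sound clip
image = rng.random((32, 32))
signal = rng.standard_normal(1024)
kernels = rng.standard_normal((4, 3, 3))     # 4 untrained 3x3 filters

v = visual_features(image, kernels)          # 4 visual features
a = audio_features(signal)                   # 8 auditory features

n_classes = 7                                # seven material types, per the abstract
W = rng.standard_normal((n_classes, v.size + a.size)) * 0.1
b = np.zeros(n_classes)

fused, probs = fuse_and_classify(v, a, W, b)
print(fused.shape, probs.shape)
```

Because the weights here are random, the class probabilities are meaningless; the sketch only shows how the two feature vectors are joined into one input for a shared classifier. In a trained model, the extractors and the classifier would be learned jointly or in stages.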

Journal: INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
Publication Date: Sep 2, 2024
License type: CC BY-NC 4.0
