Articles published on Multi-modal Learning
5380 Search results
- New
- Research Article
- 10.1016/j.psychres.2026.117090
- Jun 1, 2026
- Psychiatry research
- Shiau-Shian Huang + 6 more
The diagnosis of Major Depressive Disorder (MDD) relies heavily on subjective clinical assessments. This study evaluated various machine learning (ML) models in differentiating between MDD patients and healthy controls using resting-state electroencephalography (EEG) features and clinical variables as inputs. A total of 123 participants, including 77 MDD patients and 46 sex- and age-matched controls, underwent resting-state EEG recording and 11 standardized clinical assessments. From each EEG, we extracted absolute and relative power and functional connectivity metrics, including phase locking value, phase lag index (PLI), and weighted PLI across standard EEG frequency bands. Nineteen ML classifiers were evaluated using leave-one-out cross-validation. The best-performing EEG-only model, using absolute power with a medium KNN classifier, achieved an area under the curve (AUC) of 0.876. The best-performing clinical-only model yielded an AUC of 0.849. A combined model integrating EEG absolute power and clinical variables further improved performance (AUC = 0.896). Integrating EEG features with clinical data significantly enhanced the MDD classification performance. These findings support the potential of using multimodal data fusion and machine learning to develop objective diagnostic tools for the assessment of psychiatric disorders.
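The evaluation protocol named in the abstract above (leave-one-out cross-validation scored by AUC) can be illustrated with a minimal sketch. This is not the study's pipeline: the data are synthetic placeholders that only mirror the 77/46 group sizes, the two features stand in for EEG and clinical variables, `k=10` is an assumed neighbour count, and the helper names `knn_score`, `loocv_scores`, and `auc` are hypothetical.

```python
import math
import random


def knn_score(train_X, train_y, x, k):
    """Fraction of the k nearest training points (Euclidean) labelled 1."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x))[:k]
    return sum(train_y[i] for i in nearest) / k


def loocv_scores(X, y, k):
    """Score each sample with neighbours drawn from all the other samples."""
    scores = []
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        scores.append(knn_score(rest_X, rest_y, X[i], k))
    return scores


def auc(y, scores):
    """AUC as the chance a positive outranks a negative; ties count 0.5."""
    pos = [s for s, label in zip(scores, y) if label == 1]
    neg = [s for s, label in zip(scores, y) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic placeholder data (NOT from the study): two Gaussian clusters
# sized like the cohort, 77 positives and 46 negatives.
rng = random.Random(0)
X = ([[rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)] for _ in range(77)]
     + [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(46)])
y = [1] * 77 + [0] * 46

scores = loocv_scores(X, y, k=10)   # k=10 is an assumption
loocv_auc = auc(y, scores)
print(round(loocv_auc, 3))
```

Because every held-out sample is scored without ever being among its own neighbours, the pooled scores give a single AUC over all 123 predictions, which is one common way to report AUC under leave-one-out when each fold contains only one sample.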
- New
- Research Article
- 10.1016/j.sasc.2026.200470
- Jun 1, 2026
- Systems and Soft Computing
- Jing Zhou
Intelligent matching methods for educational resources under a multimodal deep learning framework
- New
- Research Article
- 10.1016/j.artmed.2026.103395
- Jun 1, 2026
- Artificial intelligence in medicine
- Xin Zhang + 7 more
Acute Coronary Syndromes (ACS), including ST- and non-ST-segment elevation myocardial infarction (STEMI, NSTEMI), remain a leading cause of global mortality. Traditional Cardiovascular Risk Scores (CVRS) provide important insights but mainly rely on clinical data, often neglecting environmental factors (e.g., air pollution, climate) that significantly influence cardiovascular health. Integrating complex time-series environmental and clinical datasets also presents substantial challenges. We propose TabulaTime, a multimodal deep learning framework integrating clinical risk factors with environmental data to enhance ACS risk prediction. TabulaTime delivers three innovations: multimodal integration of time-series environmental and clinical data; PatchRWKV for extracting complex temporal patterns with linear computational complexity; and enhanced interpretability through attention mechanisms. TabulaTime improves prediction accuracy by 20.5% over CatBoost, with environmental data contributing a 10.1% gain. PatchRWKV outperforms state-of-the-art models (MLP-, CNN-, RNN- and Transformer-based models). Feature analysis highlights key clinical and environmental predictors. This approach advances personalised prevention and strengthens public health against cardiovascular risks.
- New
- Research Article
- 10.1016/j.ijmedinf.2026.106348
- Jun 1, 2026
- International journal of medical informatics
- Shidi Miao + 8 more
GAST-NET: A multi-modal and multi-task deep learning framework for preoperative prediction of perineural invasion and prognostic risk in gastric cancer.
- New
- Research Article
- 10.1016/j.ejrad.2026.112758
- Jun 1, 2026
- European journal of radiology
- Zhiqiang Wan + 8 more
Multi-modal deep learning model for predicting recurrence of moderately severe and severe acute pancreatitis.
- New
- Research Article
- 10.1016/j.mlwa.2026.100891
- Jun 1, 2026
- Machine Learning with Applications
- Mason Li + 3 more
Accurate and early diagnosis of Alzheimer’s disease (AD) is critical for effective intervention, disease monitoring, and patient care. Traditional diagnostic approaches rely on a single modality, such as clinical assessments, neuroimaging, or genetic markers, which may fail to capture the complex, multifaceted nature of AD. Multimodal learning has therefore been explored to integrate complementary information across data sources. However, conventional fusion strategies, including early feature concatenation and late decision-level fusion, often model modalities independently and fail to capture high-order cross-modal interactions. To address these limitations, we propose a multimodal tensor fusion network (MTFN) that integrates heterogeneous data sources, including visual imagery, demographics, and longitudinal time-series data, to enhance AD recognition. Our approach leverages tensor representations to model intricate cross-modal interactions while preserving structural dependencies within each modality. Experimental results on publicly available AD datasets demonstrate that the proposed method achieves higher accuracy than state-of-the-art deep learning classifiers. This work highlights the potential of tensor-based multimodal learning to advance precision medicine for neurodegenerative diseases.
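The contrast the abstract above draws between concatenation and tensor-based fusion can be made concrete with a small sketch. It shows the generic outer-product form of tensor fusion, in which each embedding is padded with a constant 1 before the product is taken. Whether MTFN uses exactly this construction is not stated in the abstract, so treat it as an assumption; the function names and the toy embeddings are hypothetical.

```python
from itertools import product


def tensor_fusion(*embeddings):
    """Flattened outer product of the embeddings, each padded with a 1.

    The padding keeps the unimodal and pairwise terms in the result
    alongside the highest-order cross-modal products.
    """
    padded = [[1.0] + list(e) for e in embeddings]
    fused = []
    for combo in product(*padded):
        term = 1.0
        for value in combo:
            term *= value
        fused.append(term)
    return fused


def concat_fusion(*embeddings):
    """Early-fusion baseline: plain concatenation, no cross-modal products."""
    return [v for e in embeddings for v in e]


# Hypothetical toy embeddings for three modalities.
image = [0.5, 2.0]
demographics = [3.0]
timeseries = [1.0, 4.0]

fused = tensor_fusion(image, demographics, timeseries)
baseline = concat_fusion(image, demographics, timeseries)
print(len(baseline), len(fused))
```

The fused vector has (2+1)(1+1)(2+1) entries against 5 for concatenation. That multiplicative growth is the price of explicit high-order interactions, and it is why practical variants often compress the product with low-rank factors.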
- New
- Research Article
- 10.1016/j.media.2026.104041
- Jun 1, 2026
- Medical image analysis
- Qiaoyu Han + 7 more
Clinical priors-inspired privileged knowledge distillation for reliable pancreatic lesion classification.
- New
- Research Article
- 10.1016/j.eswa.2026.131541
- Jun 1, 2026
- Expert Systems with Applications
- Jian Zhu + 7 more
Curriculum trustworthy multi-modal learning
- New
- Research Article
- 10.1016/j.pnpbp.2026.111702
- Jun 1, 2026
- Progress in neuro-psychopharmacology & biological psychiatry
- Simin Kang + 5 more
Predicting adult functional outcomes in childhood-onset attention-deficit/hyperactivity disorder using multimodal MRI and machine learning: A prospective follow-up study.
- New
- Research Article
- 10.1016/j.saa.2026.127623
- Jun 1, 2026
- Spectrochimica acta. Part A, Molecular and biomolecular spectroscopy
- Yu Sun + 6 more
A transformer and 3D CNN-based feature fusion network with interpretable ability for Raman spectra analysis: improving the diagnosis of thyroid cancer.
- New
- Research Article
- 10.1016/j.jad.2026.121259
- Jun 1, 2026
- Journal of affective disorders
- Soonho Ha + 5 more
Multimodal machine learning models for predicting remission in major depressive disorder using clinical data, blood biomarkers, and DNA methylation.
- New
- Research Article
- 10.1016/j.drugalcdep.2026.113128
- Jun 1, 2026
- Drug and alcohol dependence
- Linqi Lu + 9 more
Foodie traps within Facebook cannabis promotional posts: Deploying multimodal deep learning AIs to monitor audience engagement.
- New
- Research Article
- 10.1016/j.cexr.2025.100126
- Jun 1, 2026
- Computers & Education: X Reality
- Yohan Hwang + 1 more
An exploratory study on the development of pre-service English teachers’ VR-contextualized testing prototype
- New
- Research Article
- 10.1016/j.bspc.2026.109885
- Jun 1, 2026
- Biomedical Signal Processing and Control
- Shenglan Zhong + 5 more
A multimodal deep learning framework for hemiplegic gait recognition using skeleton and wearable sensor data
- New
- Research Article
- 10.1016/j.eswa.2026.131802
- Jun 1, 2026
- Expert Systems with Applications
- Tan Cheng + 4 more
Tell popular from unpopular: A popularity-guided bipolar multimodal interactive prototype learning method for micro-video popularity prediction
- New
- Research Article
- 10.1016/j.bspc.2026.109931
- Jun 1, 2026
- Biomedical Signal Processing and Control
- Mohammad Khaleel Sallam Ma’Aitah + 2 more
VitalGuard-AI: a real-time multi-modal deep learning framework for intelligent health monitoring using wearable IoT devices
- New
- Research Article
- 10.1016/j.media.2026.104012
- Jun 1, 2026
- Medical image analysis
- Junhao Wu + 6 more
Multimodal medical endoscopic image analysis via progressive disentangle-aware contrastive learning.
- New
- Research Article
- 10.1016/j.aiia.2026.03.001
- Jun 1, 2026
- Artificial Intelligence in Agriculture
- Shahram Hamza Manzoor + 9 more
Pollination optimization in apple orchards faces increasing challenges from climate variability and declining pollinator populations, necessitating precision timing strategies. This study introduces a novel Pollination Importance Index (PII) integrated with a hybrid multi-task deep learning framework (PII-CNN-LSTM) to identify critical pollination windows. The PII dynamically quantifies pollination potential by incorporating flower receptivity, resource availability, biotic stress, and pollinator activity across five apple flower growth stages. The PII-CNN-LSTM architecture simultaneously performs growth stage classification and importance prediction through CNN spatial feature extraction and LSTM temporal modeling, enhanced by attention mechanisms and residual connections. Comparative evaluation against PII-CNN-BiLSTM, PII-CNN-GRU, and PII-CNN-TCN architectures demonstrated superior performance with 97% classification accuracy and minimal prediction error (validation loss: 0.0065, MAE: 0.0505). The model achieved exceptional full-bloom stage identification (99% F1-score), corresponding to its dominant 61.5% contribution to overall pollination importance. Cross-validation using 2024–2025 ground truth data and real-time drone deployment confirmed robust generalizability with temporal correlations exceeding 0.94. The framework successfully identified the critical pollination window from day 3 to day 9, with optimal intervention timing at days 5 to 7, when importance scores exceeded 0.40. This biologically grounded temporal precision enables targeted deployment of pollination resources during peak receptivity periods, reducing the need for continuous monitoring and intervention throughout the entire flowering season. The biologically grounded approach provides scalable, data-driven decision support for precision agriculture, representing a significant advancement in agricultural automation and orchard productivity optimization.
• Developed Pollination Importance Index (PII) integrating key pollination factors.
• Identified optimal pollination window at days 5–7 with >0.94 temporal correlations.
• PII-CNN-LSTM achieved 97% accuracy, outperforming BiLSTM, GRU, and TCN models.
• Real-time drone deployment achieved 90% accuracy with YOLOv8s-PII-CNN-LSTM pipeline.
• Six-channel fusion combining RGB imagery, PII score, image labels, and temporal sequences.
- New
- Research Article
- 10.1016/j.neucom.2026.133546
- Jun 1, 2026
- Neurocomputing
- Jin Yang + 1 more
Active multi-modality learning for medical image segmentation via disentangling modalities and evaluating anatomical distribution
- New
- Research Article
- 10.1016/j.bspc.2026.109868
- Jun 1, 2026
- Biomedical Signal Processing and Control
- Jinghui Yao + 8 more
A self-supervised representation-transfer multimodal learning for automated diagnosis of Developmental Dysplasia of the Hip