Abstract
In recent years, 2D convolutional neural networks (CNNs) have been extensively used to diagnose neurological diseases from magnetic resonance imaging (MRI) data due to their potential to discern subtle and intricate patterns. Despite the high performance reported in numerous studies, developing CNN models with good generalization abilities remains a challenging task because of possible data leakage introduced during cross-validation (CV). In this study, we quantitatively assessed the effect of data leakage caused by splitting 3D MRI data at the 2D slice level, using three 2D CNN models to classify patients with Alzheimer’s disease (AD) and Parkinson’s disease (PD). Our experiments showed that slice-level CV erroneously boosted the average slice-level accuracy on the test set by 30% on the Open Access Series of Imaging Studies (OASIS), 29% on the Alzheimer’s Disease Neuroimaging Initiative (ADNI), 48% on the Parkinson’s Progression Markers Initiative (PPMI), and 55% on a local de-novo PD Versilia dataset. Further tests on a randomly labeled OASIS-derived dataset produced an (erroneous) accuracy of about 96% with a slice-level split and about 50% with a subject-level split, as expected from a randomized experiment. Overall, the effect of an erroneous slice-based CV is severe, especially for small datasets.
Highlights
Deep learning has become a popular class of machine learning algorithms in computer vision and has been successfully employed in various tasks, including multimedia analysis, natural language processing, and robotics[1].
The worst case stemmed from the randomly labeled OASIS dataset, which resulted in a model with unacceptably high performance using slice-level CV, whereas classification results obtained using subject-level CV were about 50%, in accordance with the expected outcome for a balanced dataset with completely random labels.
We showed the performance of three 2D convolutional neural network (CNN) models trained with subject-level and slice-level CV data splits to classify Alzheimer’s disease (AD) and Parkinson’s disease (PD) patients from healthy controls using T1-weighted brain magnetic resonance imaging (MRI) data.
Summary
Deep learning has become a popular class of machine learning algorithms in computer vision and has been successfully employed in various tasks, including multimedia analysis (image, video, and audio analysis), natural language processing, and robotics[1]. Deep convolutional neural networks (CNNs) hierarchically learn high-level and complex features from input data, eliminating the need for handcrafted features, as in conventional machine learning schemes[2]. The application of these methods in neuroimaging is rapidly growing (see Greenspan et al.[3] and Zaharchuk et al.[4] for reviews). Previous works applied stacked autoencoders[14,17,18] and deep belief networks[19] to classify neurological patients from healthy subjects using data collected from different neuroimaging modalities, including magnetic resonance imaging (MRI), positron emission tomography (PET), resting-state functional MRI (rsfMRI), and combinations of these modalities[20]. Wu et al.[23] adopted a pre-trained CaffeNet and achieved accuracies of 98.71%, 72.04%, and 92.35%.
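The leakage mechanism discussed above can be illustrated with a minimal sketch. Here, synthetic "slice features" stand in for real MRI slices, and subject identifiers are hypothetical; the point is only the split logic. A naive slice-level split (`KFold` over shuffled slices) lets slices from the same brain appear in both training and test folds, whereas a subject-level split (`GroupKFold` with subject IDs as groups) keeps every subject's slices on one side of the split:

```python
import numpy as np
from sklearn.model_selection import KFold, GroupKFold

# Hypothetical toy setup: 20 subjects, 10 axial slices each.
rng = np.random.default_rng(0)
n_subjects, n_slices = 20, 10
subject_ids = np.repeat(np.arange(n_subjects), n_slices)  # one id per slice
X = rng.normal(size=(n_subjects * n_slices, 32))          # stand-in slice features

def subject_overlap(train_idx, test_idx):
    """Number of subjects whose slices appear in both train and test."""
    return len(set(subject_ids[train_idx]) & set(subject_ids[test_idx]))

# Slice-level CV: slices are shuffled independently of their subject,
# so slices from the same brain can land in both folds (data leakage).
leaky = KFold(n_splits=5, shuffle=True, random_state=0)
leaks = [subject_overlap(tr, te) for tr, te in leaky.split(X)]

# Subject-level CV: GroupKFold keeps all slices of a subject together.
safe = GroupKFold(n_splits=5)
clean = [subject_overlap(tr, te) for tr, te in safe.split(X, groups=subject_ids)]

print("subjects shared between train/test (slice-level):", leaks)
print("subjects shared between train/test (subject-level):", clean)
```

With the slice-level split, every fold shares subjects between train and test, so a CNN can partially memorize individual brains rather than disease-related patterns; the subject-level split shares none, which is why the paper's subject-level results are the trustworthy ones.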