Black-box deep learning (DL) models trained for the early detection of Alzheimer’s Disease (AD) often lack systematic model interpretation. This work computes the brain regions activated during DL and compares them with the explanations of classical Machine Learning (ML) models. The DL architectures were 3D DenseNets, EfficientNets, and Squeeze-and-Excitation (SE) networks. The classical models include Random Forests (RFs), Support Vector Machines (SVMs), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting (LightGBM), Decision Trees (DTs), and Logistic Regression (LR). For the explanations, SHapley Additive exPlanations (SHAP) values, Local Interpretable Model-agnostic Explanations (LIME), Gradient-weighted Class Activation Mapping (GradCAM), GradCAM++, and permutation-based feature importance were implemented. During interpretation, correlated features were consolidated into aspects. All models were trained on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset. The validation includes internal validation and external validation on the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL) and the Open Access Series of Imaging Studies (OASIS). DL and ML models reached similar classification performances; however, the two model types focus on different brain regions. The ML models focus on the inferior and middle temporal gyri, the hippocampus, and the amygdala, regions previously associated with AD. The DL models focus on a wider range of regions, including the optic chiasm, the entorhinal cortices, the left and right vessels, and the 4th ventricle, which have only partially been associated with AD. One explanation for these differences is the input features (textures vs. volumes). Both model types show reasonable similarity to a ground-truth Voxel-Based Morphometry (VBM) analysis, with slightly higher similarities measured for the ML models.
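As an illustration of the classical ML explanation pipeline named above, the following Python sketch (not the authors' code; the XGBoost classifier, the synthetic region-wise features, and the binary labels are assumptions for demonstration only) shows how SHAP values and permutation-based feature importance can be computed and summarized per feature.

```python
# Minimal sketch, assuming synthetic region-wise features and a binary AD label.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.inspection import permutation_importance
from xgboost import XGBClassifier
import shap

rng = np.random.default_rng(0)
n_subjects, n_regions = 500, 100              # e.g. one volume/texture value per atlas region
X = rng.normal(size=(n_subjects, n_regions))  # placeholder features (illustrative only)
y = rng.integers(0, 2, size=n_subjects)       # 0 = cognitively normal, 1 = AD (synthetic)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = XGBClassifier(n_estimators=200, max_depth=3, eval_metric="logloss")
model.fit(X_train, y_train)

# Permutation importance: performance drop when each regional feature is shuffled.
perm = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

# SHAP values: additive per-subject contributions of each feature to the prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Global summaries: rank features by mean |SHAP| and by permutation importance.
print("Top features by SHAP:", np.argsort(np.abs(shap_values).mean(axis=0))[::-1][:5])
print("Top features by permutation importance:", np.argsort(perm.importances_mean)[::-1][:5])
```

On the DL side, GradCAM highlights the voxels that drive a 3D CNN's prediction. The sketch below is again an assumption-laden stand-in: a tiny 3D CNN and a random volume replace the paper's 3D DenseNets/EfficientNets/SE networks and preprocessed MRIs. It shows only the core GradCAM computation: gradient-weighted averaging of the last convolutional feature maps, followed by upsampling to the input resolution.

```python
# Minimal GradCAM sketch for a 3D CNN (illustrative stand-in, not the paper's models).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tiny3DCNN(nn.Module):
    """Small 3D CNN standing in for the 3D DenseNet/EfficientNet/SE networks."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
        )
        self.classifier = nn.Linear(16, n_classes)

    def forward(self, x):
        feats = self.features(x)                               # last conv feature maps
        pooled = F.adaptive_avg_pool3d(feats, 1).flatten(1)
        return self.classifier(pooled), feats

model = Tiny3DCNN().eval()
volume = torch.randn(1, 1, 64, 64, 64)                        # synthetic "MRI" volume

logits, feats = model(volume)
feats.retain_grad()                                            # keep gradients of the feature maps
logits[0, 1].backward()                                        # backprop the "AD" class score

weights = feats.grad.mean(dim=(2, 3, 4), keepdim=True)         # channel weights: averaged gradients
cam = F.relu((weights * feats).sum(dim=1, keepdim=True))       # weighted sum of maps + ReLU
cam = F.interpolate(cam, size=volume.shape[2:], mode="trilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)       # normalize heatmap to [0, 1]
print(cam.shape)                                               # torch.Size([1, 1, 64, 64, 64])
```

Mapping such a heatmap onto an anatomical atlas yields the kind of region-level attributions that the abstract compares against the ML feature importances and the VBM analysis.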