Abstract

Background
Image‐based machine learning models show promise for aiding diagnosis in dementia, e.g. through diagnostic classification or estimation of cognitive performance. However, the contribution of vascular pathology is often ignored, although many patients have mixed pathologies. This may hamper model generalizability from research settings to clinical practice, as models are generally developed on populations with a lower vascular burden than is encountered clinically. This study aims to evaluate whether model generalizability is influenced by differences in vascular pathology load, represented by white matter hyperintensities (WMHs), between training and test groups.

Method
We developed a convolutional neural network to estimate cognitive performance as measured by ADAS13. We included 4791 timepoints from 1632 ADNI subjects. To mimic differences in vascular burden between research and clinical populations, we split the data into low, middle and high WMH load groups using tertiles of the WMH ratio (WMH volume corrected for intracranial volume). The model was applied to maps derived from T1‐weighted and FLAIR brain MRI using voxel‐based morphometry, in three input configurations: model 1 (gray matter density map (GMDM), age, sex), model 2 (GMDM, WMH density map (WMHDM), age, sex) and model 3 (GMDM, log(WMH ratio), age, sex) (Fig. 1). First, we trained all models on the low WMH load group, intending to test them on the high WMH load group to assess generalizability. Second, we trained models 1 and 2 on all data. Mean absolute error (MAE) served as the performance metric.

Result
The models trained on the low WMH load group (MAE = 5.08–5.41) performed similarly to constant value prediction (MAE = 5.63) (Fig. 2, circles). When trained on all data, the models did train successfully (model 1: MAE = 5.64, model 2: MAE = 6.31, constant value: MAE = 7.39) (Fig. 2, crosses). Because model training was suboptimal in the low WMH load group, especially for individuals with higher ADAS13 scores, who were underrepresented in the data (Fig. 3), we could not evaluate generalizability to the high WMH load group.

Conclusion
This study was unable to provide new insight into the influence of vascular pathology on the generalizability of machine learning models in dementia. More clinically representative data is needed, covering the full cognitive performance spectrum and higher vascular pathology loads.
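
The tertile split and the constant value baseline described in the Method can be illustrated with a minimal sketch. This is not the study's code. The function names and the synthetic data are hypothetical. Two further assumptions are made here: the WMH ratio is computed as WMH volume divided by intracranial volume, and the constant prediction is the median of the training targets (the abstract does not specify which constant was used).

```python
import numpy as np


def wmh_tertile_groups(wmh_volume, intracranial_volume):
    """Assign each sample to a low (0), middle (1) or high (2) WMH load group."""
    # Assumption: "corrected for intracranial volume" is taken to mean a simple ratio.
    wmh_ratio = np.asarray(wmh_volume, dtype=float) / np.asarray(
        intracranial_volume, dtype=float
    )
    # Tertile boundaries are the 1/3 and 2/3 quantiles of the WMH ratio.
    lower, upper = np.quantile(wmh_ratio, [1 / 3, 2 / 3])
    # digitize gives 0 below `lower`, 1 in [lower, upper), 2 at or above `upper`.
    return np.digitize(wmh_ratio, [lower, upper])


def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def constant_baseline_mae(y_train, y_test):
    """MAE when every test sample is predicted as the training median (assumed constant)."""
    constant = float(np.median(y_train))
    return mean_absolute_error(y_test, np.full(np.shape(y_test), constant))


# Synthetic stand-in data; none of these values come from ADNI.
rng = np.random.default_rng(0)
n = 300
icv = rng.normal(1500.0, 100.0, size=n)           # made-up intracranial volumes
wmh = rng.lognormal(mean=1.0, sigma=1.0, size=n)  # made-up WMH volumes
scores = rng.uniform(0.0, 85.0, size=n)           # made-up cognitive scores

groups = wmh_tertile_groups(wmh, icv)
train_mask, test_mask = groups == 0, groups == 2  # train on low load, test on high load

print("group sizes:", np.bincount(groups))
print("constant baseline MAE:", constant_baseline_mae(scores[train_mask], scores[test_mask]))
```

A trained model is only informative if its MAE on the test group falls clearly below this baseline, which is the comparison reported in the Result section.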