We aimed to systematically evaluate which lesion-based imaging features and methods allow for the best statistical prediction of poststroke deficits across independent datasets. We used imaging and clinical data from three independent datasets of patients with acute stroke (N1 = 109, N2 = 638, N3 = 794) to statistically predict acute stroke severity (NIHSS) based on lesion volume, lesion location, and structural and functional disconnection associated with the lesion location, estimated using normative connectomes. Prediction models trained on small single-center datasets could perform well under within-dataset cross-validation, but these results did not generalize to independent datasets (median R² for N1 = 0.2%). Performance across independent datasets improved with large single-center training data (R² for N2 = 15.8%) and improved further with multicenter training data (R² for N3 = 24.4%). These results were consistent across lesion attributes and prediction models. Models that included either structural or functional disconnection outperformed prediction based on volume or location alone (P < 0.001, FDR-corrected). We conclude that (1) prediction performance in independent datasets of patients with acute stroke cannot be inferred from cross-validated results within a dataset, as the two evaluation methods yielded consistently different performance, (2) prediction performance can be improved by training on large and, importantly, multicenter datasets, and (3) structural and functional disconnection allow for improved prediction of acute stroke severity.