Background
In Huntington’s disease (HD) clinical trials, recruitment and stratification rely primarily on genetic load and on cognitive and motor assessment scores. They make less use of in vivo brain imaging markers, which reflect neuropathology well before clinical diagnosis. Machine learning methods could significantly improve prognosis and stratification by leveraging multimodal biomarkers from large datasets. Models tailored specifically to HD gene expansion carriers could further enhance the efficacy of the stratification process.

Objectives
To improve stratification of individuals with Huntington’s disease for clinical trials.

Methods
We used data from 451 gene-positive individuals with Huntington’s disease (both premanifest and diagnosed) from previously published cohorts (PREDICT, TRACK, TrackON, and IMAGE). We applied whole-brain parcellation to longitudinal brain scans and measured the rate of lateral ventricular enlargement over 3 years, which served as the target variable for our prognostic random forest regression models. The models were trained on various combinations of baseline features, including genetic load, cognitive and motor assessment scores, and brain imaging-derived features. In addition, a simplified stratification model was developed to classify individuals into two homogeneous groups (low risk and high risk) based on their anticipated rate of ventricular enlargement.

Results
The predictive accuracy of the prognostic models improved substantially when brain imaging features were integrated alongside genetic load, cognitive and motor biomarkers: the cross-validated mean absolute error fell by 24 %, to 530 mm³/year. The stratification model had a cross-validated accuracy of 81 % in differentiating between moderate and fast progressors (precision = 83 %, recall = 80 %).

Conclusions
This study validated the effectiveness of machine learning in differentiating between low- and high-risk individuals based on the rate of ventricular enlargement. The models were exclusively trained using features from HD individuals, which offers a more disease-specific, simplified, and accurate approach for prognostic enrichment compared to relying on features extracted from healthy control groups, as done in previous studies. The proposed method has the potential to enhance clinical utility by: i) enabling more targeted recruitment of individuals for clinical trials, ii) improving post-hoc evaluation of individuals, and iii) ultimately leading to better outcomes for individuals through personalized treatment selection.
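
The evaluation quantities reported above (mean absolute error, its relative reduction, and the accuracy, precision and recall of a two-group stratification) can be illustrated with a minimal sketch. This is not the study's code: the numbers are invented toy values in mm³/year, the function names are hypothetical, and splitting the two groups at the median observed rate is an assumption made here for illustration only, since the abstract does not state how the groups were defined.

```python
from statistics import mean, median


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted rates."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def relative_reduction(baseline_error, new_error):
    """Fractional drop in error relative to the baseline model."""
    return (baseline_error - new_error) / baseline_error


def stratify(rates, threshold):
    """Label each individual 1 (faster enlargement) or 0 (slower)."""
    return [1 if rate > threshold else 0 for rate in rates]


def classification_metrics(true_labels, predicted_labels):
    """Accuracy, precision and recall for the positive (label 1) group."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy data (invented): observed ventricular enlargement rates and the
# predictions of two hypothetical models, all in mm^3/year.
observed = [400.0, 900.0, 1500.0, 2100.0, 700.0, 1800.0]
pred_clinical_only = [900.0, 500.0, 1000.0, 1500.0, 1300.0, 1200.0]
pred_with_imaging = [600.0, 800.0, 1300.0, 1900.0, 1300.0, 1500.0]

mae_clinical = mean_absolute_error(observed, pred_clinical_only)
mae_imaging = mean_absolute_error(observed, pred_with_imaging)
reduction = relative_reduction(mae_clinical, mae_imaging)

# Assumption: the two groups are separated at the median observed rate.
threshold = median(observed)
true_groups = stratify(observed, threshold)
predicted_groups = stratify(pred_with_imaging, threshold)
metrics = classification_metrics(true_groups, predicted_groups)

print(f"MAE (clinical only): {mae_clinical:.1f} mm^3/year")
print(f"MAE (with imaging):  {mae_imaging:.1f} mm^3/year")
print(f"Relative reduction:  {reduction:.0%}")
print(metrics)
```

In a cross-validated setting, the same quantities would be computed on held-out folds and then averaged, so that the reported error reflects individuals the model has not seen during training.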