Abstract

Background: Surveillance of aortic dimensions requires reproducible measurements and knowledge of the rate of progression in patients with dilatation.

Purpose: To develop an open machine-learning method for measuring the aortic root and proximal ascending aorta on echocardiograms, to validate it against a multi-expert panel from the Unity UK Echocardiography AI Collaborative, and to apply the AI to derive the rate of progression in historical databases.

Methods: The neural network was trained on 1478 parasternal long-axis images. Each image was labelled with key points marking the dimensions of the aortic annulus, sinus, sinotubular junction and proximal ascending aorta. Labels accommodated inner-to-inner and leading-to-leading conventions. For validation, end-diastolic and mid-systolic images were identified in 100 studies (accommodating differing guideline recommendations), and 10 expert echocardiographers made the aortic measurements. The consensus of experts defined the reference standard, and the variation between individual experts defined the acceptable variation for the AI. The AI was then applied to 724 echocardiograms from 102 patients under surveillance without intervention, spanning 12 years (4 to 17 scans per patient). Training data labels and networks are made freely available on our project website.

Results: The median absolute deviation between the AI and the expert consensus ranged from 0.06 cm to 0.17 cm across the 16 measurements. For individual experts, the corresponding range was 0.08 cm to 0.19 cm. For 7 of the 16 measurements, there was no statistically significant difference between the AI deviation and the individual-expert deviation. The AI's confidence level was a useful indicator of the reliability of its measurements. For the top 9 deciles of AI confidence level, the AI measurements had significantly smaller deviations than those of the individual experts.
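The comparison of AI and individual-expert deviations from consensus can be illustrated with a minimal sketch. It assumes the consensus for each image is the mean of all expert readings; the abstract does not state the consensus rule, nor whether each expert was compared against a consensus that excluded their own reading. The function names and all values below are hypothetical and are not data from the study.

```python
from statistics import mean, median


def consensus_per_image(expert_readings):
    """Mean of all experts' values for each image (assumed consensus rule)."""
    per_image = zip(*expert_readings.values())
    return [mean(values) for values in per_image]


def median_abs_deviation(measurements, reference):
    """Median of |measurement - reference| over all images."""
    return median(abs(m - r) for m, r in zip(measurements, reference))


# Made-up diameters in cm for three images; not data from the study.
expert_readings = {
    "expert_a": [3.0, 3.4, 2.8],
    "expert_b": [3.2, 3.5, 2.9],
    "expert_c": [3.1, 3.6, 3.0],
}
ai_readings = [3.15, 3.45, 2.90]

reference = consensus_per_image(expert_readings)
ai_deviation = median_abs_deviation(ai_readings, reference)
expert_deviations = {
    name: median_abs_deviation(values, reference)
    for name, values in expert_readings.items()
}

print(f"AI: {ai_deviation:.2f} cm")
for name, deviation in expert_deviations.items():
    print(f"{name}: {deviation:.2f} cm")
```

In this toy setup each expert's reading contributes to the consensus it is judged against, which slightly favours the experts; a leave-one-out consensus would remove that advantage.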
The rate of progression, averaged across the three aortic measurements, was 2.0 mm/decade (95% CI 1.0 to 3.0; p = 0.007).

Conclusion: Expert consensus can be used to define both the reference standard and the range of acceptable deviation from it. Overall, the AI performs similarly to experts and, more importantly, provides an automated estimate of confidence. The validated AI is consistent over time and, when applied retrospectively to patients with dilated aortic roots, found a rate of progression more rapid than that previously reported in healthy patients. This approach could be applied generally in developing medical AI, not only to make reproducible measurements but also to automate large-scale longitudinal studies with little input, and could ultimately serve as a useful research and clinical tool.

Figure: AI (red) vs Expert (blue)

Table: AI Deviations from Consensus
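One simple way to derive a rate of progression from serial automated measurements is sketched below. It assumes an ordinary least-squares slope of diameter against time for each patient, averaged across patients and scaled to mm/decade. The abstract does not specify the statistical model used, so this is an illustration and not the authors' method; the function names and values are hypothetical.

```python
from statistics import mean


def slope_mm_per_year(times_years, diameters_mm):
    """Ordinary least-squares slope of diameter against time for one patient."""
    t_bar = mean(times_years)
    d_bar = mean(diameters_mm)
    numerator = sum(
        (t - t_bar) * (d - d_bar) for t, d in zip(times_years, diameters_mm)
    )
    denominator = sum((t - t_bar) ** 2 for t in times_years)
    if denominator == 0:
        raise ValueError("need scans at two or more distinct time points")
    return numerator / denominator


def mean_progression_mm_per_decade(patients):
    """Average the per-patient slopes and convert mm/year to mm/decade."""
    slopes = [slope_mm_per_year(times, diams) for times, diams in patients.values()]
    return 10.0 * mean(slopes)


# Made-up serial measurements (years since first scan, diameter in mm);
# not data from the study.
patients = {
    "patient_1": ([0, 2, 4, 6], [40.0, 40.2, 40.4, 40.6]),
    "patient_2": ([0, 5, 10], [42.0, 43.0, 44.0]),
}

rate = mean_progression_mm_per_decade(patients)
print(f"Mean progression: {rate:.1f} mm/decade")
```

Averaging per-patient slopes weights every patient equally regardless of the number of scans; a mixed-effects model would be a common alternative when scan counts vary widely, as they do here (4 to 17 per patient).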