Abstract

Background
Accurate echocardiographic assessment of left ventricular (LV) function depends on both image acquisition and analysis. Although 3D echocardiography (3DE) is increasingly recommended for volume quantification, manually derived volumes are limited by operator subjectivity, variable image quality, and time-consuming analysis. Consequently, several automated approaches based on artificial intelligence (AI) have been proposed to segment the LV cavity from 3DE and have demonstrated better accuracy and reproducibility than human experts. Although the clinical utility of AI models depends on their ability to generalise to different populations and ultrasound vendors, external validation of developed models remains limited.

Purpose
This study evaluates the ability of an existing AI model, trained using data from a single vendor, to accurately estimate routine 3DE LV indices in an independent external dataset acquired with equipment from a different vendor.

Methods
A previously published and internally validated deep learning model for segmentation of the LV cavity and myocardium from 3DE was used to automatically derive LV indices including end-diastolic volume (EDV), end-systolic volume (ESV), LV mass, and ejection fraction (EF) [1]. A second independent dataset consisting of paired 3DE and CMR images from 65 marathon runners, acquired at a different centre using equipment from a different vendor, was subsequently used for external validation. Images from both datasets were subjectively scored from 1 (poor) to 5 (excellent) as a measure of image quality. Agreement with CMR as the reference was assessed using a two-way mixed effects intraclass correlation coefficient (ICC), supplemented by Bland-Altman analysis, to determine the model bias (between 3DE and CMR) and 95% limits of agreement (LOA); a minimal illustrative sketch of this agreement analysis follows the abstract. To assess whether performance could be improved by including multi-vendor data during training, the initial model was fine-tuned using a small subset (n=9) of images from the second centre.

Results
The mean image quality score from the second dataset was 2.5 (± 1.1), which was significantly lower than the mean of 3.4 (± 1.0) from the first dataset. Despite vendor and image quality differences, the initial model produced good agreement with CMR for EDV, ESV, and LV mass but only poor agreement for EF (Fig 1). Fine-tuning greatly improved agreement for EF, with reduced bias and narrower 95% LOA (Fig 2).

Conclusion
With the addition of a small number of cases for fine-tuning, AI models for automated LV analysis can achieve good performance when analysing 3DE images acquired with a different vendor's equipment and in a different population. Advancing model generalisability through more sophisticated data augmentation strategies may enable truly vendor-agnostic automated LV analysis from 3DE.
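The sketch below illustrates the Bland-Altman part of the agreement analysis described in the Methods: the bias is the mean of the per-subject differences between the 3DE-derived and CMR-derived values of an LV index, and the 95% limits of agreement are the bias ± 1.96 standard deviations of those differences. The variable names and numerical values are hypothetical and for illustration only; the ICC calculation and the actual study data are not reproduced here.

```python
# Minimal Bland-Altman sketch, assuming paired per-subject values of one LV index
# (e.g. EF in %) from the AI model applied to 3DE and from CMR as the reference.
import numpy as np


def bland_altman(measure_3de, measure_cmr):
    """Return bias (mean difference) and 95% limits of agreement (bias +/- 1.96 SD)."""
    a = np.asarray(measure_3de, dtype=float)
    b = np.asarray(measure_cmr, dtype=float)
    diff = a - b                      # per-subject differences (3DE minus CMR)
    bias = diff.mean()                # systematic over- or under-estimation
    sd = diff.std(ddof=1)             # sample standard deviation of the differences
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, loa


# Hypothetical paired EF values (%) for a few subjects, purely for illustration.
ef_3de = [55.2, 48.9, 61.4, 52.0, 58.7]
ef_cmr = [57.0, 50.1, 60.2, 54.3, 59.5]

bias, (lower, upper) = bland_altman(ef_3de, ef_cmr)
print(f"Bias: {bias:.1f} %, 95% LOA: [{lower:.1f}, {upper:.1f}] %")
```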