Background: Echocardiography is the test of choice for assessing cardiac function in the diagnosis and management of heart disease. However, manual interpretation of the echocardiogram is time-consuming and subject to human error. We therefore demonstrate a fully automated deep learning workflow to classify, segment, and annotate 2D videos and Doppler modalities.

Methods: The workflow was developed using a total of 20,828 annotated 2D and Doppler modality images from 1,145 studies from a heart failure research platform in Singapore. We validated the workflow against human measurements in a dataset from Canada (N=1,029), a 'real-world' dataset from Taiwan (N=31,241), the US-based EchoNet-Dynamic dataset (N=10,030), and an independent prospective Singapore dataset (N=29 to 66) with repeated human expert measurements.

Results: In the test set, the automated workflow classified 2D videos and Doppler modalities with accuracies ranging from 0.91 to 0.99. Segmentation of the left ventricle and left atrium was accurate, with a mean Dice similarity coefficient >93% for all. In the external datasets, automated measurements showed good agreement with locally measured values, with mean absolute errors of 13-25 mL for left ventricular volumes, 6-10% for left ventricular ejection fraction, and 1.8-2.0 for E/e'; the workflow reliably classified systolic dysfunction (LVEF <40%, area under the curve [AUC] range 0.90-0.92, Figure A) and diastolic dysfunction (E/e' ≥13, AUC 0.91, Figure B). Independent prospective evaluation confirmed similar or lower variance for automated measurements compared with repeated human expert measurements.

Conclusions: Deep learning algorithms can automatically annotate 2D videos and Doppler modalities with accuracy similar to human measurements.
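For reference, the Dice similarity coefficient used above to evaluate segmentation quality measures the overlap between a predicted mask and a ground-truth mask. The following is a minimal sketch of that metric for binary masks, not the authors' implementation; the function name and the use of NumPy are illustrative assumptions.

```python
import numpy as np


def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary segmentation masks.

    Dice = 2 * |pred AND truth| / (|pred| + |truth|), ranging from 0
    (no overlap) to 1 (perfect overlap).
    """
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: conventionally treated as perfect agreement.
        return 1.0
    return 2.0 * float(intersection) / float(total)
```

A mean Dice >93%, as reported for the left ventricle and left atrium, indicates that the predicted contours overlap almost entirely with expert-drawn contours.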