Abstract

Congenital heart disease (CHD) is the leading cause of infant death. An artificial intelligence (AI)-based CHD diagnosis network (CHDNet) is a binary classification model that takes an echocardiogram video as input and judges whether it shows a heart defect. Existing CHDNets have shown performance comparable to or even better than that of medical experts, but their unreliability on cases outside the training set has become the main bottleneck for their deployment. This problem is common to most AI-based diagnostic approaches. To overcome this challenge, we present two essential mechanisms, Bayesian inference and dynamic neural feedback, to respectively measure and improve the diagnostic reliability of AI. The former lets the neural network report how reliable its prediction is rather than only a single prediction result. The latter is a computational neural feedback cell that allows the neural network to feed knowledge from the output layer back to the shallow layers and enables it to selectively activate relevant neurons. To evaluate the effectiveness of these two mechanisms, we trained CHDNets on 4151 echocardiogram videos containing three common CHD defects and tested them on an internal test set of 1037 echocardiogram videos and an external test set of 692 videos that were newly collected from other cardiovascular imaging devices. Each echocardiogram video corresponds to a unique patient and a unique visit. We demonstrate on various neural network architectures how the reliability obtained by Bayesian inference explains and quantifies the significant performance gap of neural networks between the internal and external test sets, and how the devised feedback cell helps the neural networks maintain high accuracy and reliability even when the input is corrupted by noise or drawn from the external test set.
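As a rough illustration of what "outputting reliability instead of a single prediction" can look like for a binary classifier, the sketch below summarizes several stochastic forward passes over the same video. The abstract does not define the reliability measure, so everything here is an assumption: the function names `binary_entropy` and `summarize_samples` are hypothetical, and reliability is taken to be one minus the entropy of the averaged CHD probability, which is one common choice and not necessarily the one used in this work.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) variable: 0 at p in {0, 1}, 1 at p = 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def summarize_samples(probs):
    """Summarize CHD probabilities from repeated stochastic forward passes.

    `probs` holds one predicted CHD probability per sampled pass
    (for example, passes with different sampled weights).
    Returns (label, mean probability, reliability in [0, 1]).
    The reliability definition is an assumption made for illustration.
    """
    if not probs:
        raise ValueError("at least one sampled probability is required")
    mean_p = sum(probs) / len(probs)
    reliability = 1.0 - binary_entropy(mean_p)  # assumed measure: 1 - predictive entropy
    label = int(mean_p >= 0.5)                  # 1 = defect present, 0 = no defect
    return label, mean_p, reliability


if __name__ == "__main__":
    # Passes that agree give a confident mean and a higher reliability score.
    print(summarize_samples([0.92, 0.88, 0.90]))
    # Passes that disagree pull the mean toward 0.5 and the reliability toward 0.
    print(summarize_samples([0.10, 0.90, 0.45, 0.55]))
```

Under this assumed measure, a video on which the sampled passes disagree receives a low score and could be routed to a clinician rather than accepted automatically.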
