Establishing the safety of a smart heart health monitoring service through validation

Murtadha Kareem, Oliver Faust

doi:10.1109/bigdata47090.2019.9006478

Abstract

In this study, we discuss how validation can help to establish the safety of a mature Long Short-Term Memory (LSTM) deep learning algorithm for Atrial Fibrillation (AF) detection. We argue that safety is linked to understanding and testing the deep learning model. We put forward a test scenario for a computer-aided diagnosis system which incorporates the deep learning algorithm. Specifically, we mimic a situation where the system is tasked with diagnosing a new patient. Studying this scenario allows us to determine how many false positives the system produces, i.e., how many normal subjects are diagnosed with AF. Avoiding false positives is an important safety aspect, because treatment, such as anticoagulation, carries the risk of death. Preventing false positives therefore plays a significant role in the safety of a diagnosis support system. To establish the false positive performance of our mature LSTM deep learning system, we validated it with heart rate (HR) data from normal subjects in the PhysioNet Fantasia database. None of the Fantasia HR traces were used during the deep learning model design. Furthermore, the test data was acquired with a different measurement setup than the data used for training. Hence, the Fantasia data is completely unknown to the deep learning algorithm, which has to rely on the knowledge, extracted during the learning phase, of how to differentiate AF from non-AF HR traces. Because all test data comes from normal subjects, every AF classification is a false positive, which lets us quantify how well the algorithm avoids false positive classifications. The mature LSTM deep learning algorithm achieved a false positive rate of 0.024. From a safety perspective, this result could be improved further by biasing the classification towards false negatives. However, we propose to reduce the false positive rate with a hybrid diagnosis process in which the deep learning algorithm works cooperatively with a human cardiologist. The algorithm analyzes all HR traces in real time, and the human practitioner is notified if suspicious beats are detected. Once notified, the human expert can fuse the objective deep learning results with prior knowledge about the patient to reach a safe and reliable diagnosis.
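
The false positive measurement described above can be illustrated with a minimal sketch. It assumes that the classifier emits an AF probability per HR segment and that a segment is labelled AF when this probability reaches a decision threshold; neither detail is stated in the abstract. The function name `false_positive_rate` and the scores are hypothetical and are not data from the study.

```python
def false_positive_rate(af_probabilities, threshold=0.5):
    """Fraction of normal-subject segments that are flagged as AF.

    Every input segment is assumed to come from a normal subject,
    so each segment flagged as AF counts as a false positive.
    """
    if not af_probabilities:
        raise ValueError("need at least one segment")
    false_positives = sum(p >= threshold for p in af_probabilities)
    return false_positives / len(af_probabilities)


# Hypothetical model outputs for eight HR segments from normal subjects
# (illustrative values only, not taken from the Fantasia validation).
scores = [0.02, 0.10, 0.55, 0.07, 0.91, 0.30, 0.01, 0.48]

fpr_default = false_positive_rate(scores)                 # threshold 0.5
fpr_biased = false_positive_rate(scores, threshold=0.95)  # stricter threshold
```

Raising the threshold in this sketch corresponds to biasing the classification towards false negatives: fewer normal segments are flagged, at the cost of potentially missing true AF segments, which normal-only test data cannot reveal.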

Full Text