Abstract

Deep neural networks are increasingly being used for computer-aided diagnosis, but erroneous diagnoses can be extremely costly for patients. We propose a learning to defer with uncertainty (LDU) algorithm which identifies patients for whom diagnostic uncertainty is high and defers them for evaluation by human experts. LDU was evaluated on the diagnosis of myocardial infarction (using discharge summaries), the diagnosis of any comorbidities (using structured data), and the diagnosis of pleural effusion and pneumothorax (using chest x-rays), and compared with ‘learning to defer without uncertainty information’ (LD) and ‘direct triage by uncertainty’ (DT) methods. LDU achieved the same F1 score as LD but deferred considerably fewer patients (e.g. 36% vs. 69% deferral rate for diagnosing pleural effusion with an F1 score of 0.96). Furthermore, even when many patients were assigned the wrong diagnosis with high confidence (e.g. for the diagnosis of any comorbidities) LDU achieved a 17% increase in F1 score, whereas DT was not applicable. Importantly, the weight of the defer loss in LDU can be easily adjusted to obtain the desired trade-off between diagnostic accuracy and deferral rate. In conclusion, LDU can readily augment any existing diagnostic network to reduce the risk of erroneous diagnoses in clinical practice.
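To make the trade-off between deferral rate and F1 score concrete, here is a minimal sketch of uncertainty-based deferral. It is not the authors' LDU algorithm, whose defer loss is not specified in this abstract. It is closer in spirit to the 'direct triage by uncertainty' baseline: a patient is deferred when the entropy of the predicted diagnosis exceeds a threshold. The function names, the threshold values, the toy probabilities, and the assumption that a human expert always assigns the correct label are all hypothetical and chosen only for illustration.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a binary prediction with positive-class probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def triage_by_entropy(probs, threshold):
    """Return one flag per patient: True means defer to a human expert."""
    return [binary_entropy(p) > threshold for p in probs]

def system_predictions(probs, y_true, threshold):
    """Combine model and expert decisions.

    Assumption (illustrative only): the expert is always correct, so a
    deferred patient receives the true label; all others receive the
    model's prediction at a 0.5 cut-off.
    """
    deferred = triage_by_entropy(probs, threshold)
    preds = [y if d else int(p >= 0.5)
             for p, y, d in zip(probs, y_true, deferred)]
    return preds, deferred

def f1_score(y_true, y_pred):
    """F1 score for binary labels; returns 0.0 when it is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy data (hypothetical): predicted probabilities and true diagnoses.
probs = [0.95, 0.55, 0.40, 0.05, 0.60]
y_true = [1, 0, 1, 0, 1]

# Lowering the entropy threshold defers more patients and raises F1.
for threshold in (1.0, 0.98, 0.5):
    preds, deferred = system_predictions(probs, y_true, threshold)
    rate = sum(deferred) / len(deferred)
    print(f"threshold={threshold}: deferral rate={rate:.2f}, "
          f"F1={f1_score(y_true, preds):.2f}")
```

In this sketch the entropy threshold plays a role loosely analogous to the weight of the defer loss described above: a single scalar that moves the system along the curve between diagnostic accuracy and deferral rate. A threshold on entropy cannot help when the model is confidently wrong, which is consistent with the abstract's remark that DT was not applicable in that setting.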

Highlights

  • The table suggests that the LDU algorithm also achieves lower deferral rates than the LD algorithm when performance metrics other than F1 score are considered

  • This study only considered model uncertainty, and the cost of deferring a patient for human evaluation was assumed to be constant

  • We hypothesize that this is due to the relatively small number of parameters of the underlying diagnostic network and the consequent low epistemic uncertainty for the predicted diagnoses (leading to low diagnostic entropy for the LDU and DT algorithms)


Summary

Objectives

Our aim is to minimize patients’ risk when machine learning (ML) models are deployed in healthcare settings, by preventing the application of computer-aided diagnoses in groups of patients for whom the expected diagnostic error is large.

