Abstract

The communication of reliable uncertainty estimates is crucial in the effort towards increasing trust in Deep Learning applications for medical image analysis. Importantly, reliable uncertainty estimates should remain stable under naturally occurring domain shifts. In this study, we evaluate the relationship between epistemic uncertainty and segmentation quality under domain shift within two clinical contexts: optic disc segmentation in retinal photographs and brain tumor segmentation from multi-modal brain MRI. Specifically, we assess the behavior of two epistemic uncertainty metrics derived from (i) a single UNet’s sigmoid predictions, (ii) deep ensembles, and (iii) Monte Carlo dropout UNets, each trained with both soft Dice and weighted cross-entropy loss. Domain shifts were modeled by excluding a group with a known characteristic (glaucoma for optic disc segmentation and low-grade glioma for brain tumor segmentation) from model development and using the excluded data as additional, domain-shifted test data. While the performance of all models dropped slightly on the domain-shifted test data compared to the in-domain test set, there was no change in the Pearson correlation coefficient between the uncertainty metrics and the Dice scores of the segmentations. However, we did observe differences in the performance of two quality assessment applications based on epistemic uncertainty between the segmentation tasks. We introduce a new metric, the empirical strength distribution, to better describe the strength of the relationship between segmentation performance and epistemic uncertainty on a dataset level. We found that failures of the studied quality assessment applications were largely caused by shifts in the empirical strength distributions between training, in-domain, and domain-shifted test datasets.
In conclusion, quality assessment tools based on the strong relationship between epistemic uncertainty and segmentation quality can be stable under small domain shifts. Developers should thoroughly evaluate the strength of this relationship on all available data and, if possible, under domain shift to ensure the validity of these uncertainty estimates on unseen data.
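As a rough illustration of the kind of evaluation summarized above, the following minimal sketch computes a per-image Dice score, a per-image uncertainty value, and their Pearson correlation across a dataset. The abstract does not define the two uncertainty metrics, so the mean binary entropy used here is only an assumed stand-in, and the function names `dice_score`, `mean_binary_entropy`, and `pearson_r` are hypothetical. For deep ensembles or Monte Carlo dropout, the probability map would typically be the average of the sampled sigmoid outputs.

```python
import numpy as np


def dice_score(pred_mask, true_mask, eps=1e-8):
    """Dice overlap between two binary masks (1.0 means perfect agreement)."""
    pred_mask = np.asarray(pred_mask).astype(bool)
    true_mask = np.asarray(true_mask).astype(bool)
    intersection = np.logical_and(pred_mask, true_mask).sum()
    return float((2.0 * intersection + eps) / (pred_mask.sum() + true_mask.sum() + eps))


def mean_binary_entropy(prob_map, eps=1e-8):
    """Image-level uncertainty: mean pixel-wise binary entropy of a probability map.

    ASSUMPTION: a stand-in metric, not necessarily one used in the study.
    """
    p = np.clip(np.asarray(prob_map, dtype=float), eps, 1.0 - eps)
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    return float(entropy.mean())


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])


def uncertainty_quality_correlation(prob_maps, true_masks, threshold=0.5):
    """Correlate per-image uncertainty with per-image Dice over a dataset."""
    dice = [dice_score(np.asarray(p) >= threshold, t) for p, t in zip(prob_maps, true_masks)]
    unc = [mean_binary_entropy(p) for p in prob_maps]
    return pearson_r(unc, dice)
```

Running `uncertainty_quality_correlation` separately on the in-domain and the domain-shifted test sets would give the two coefficients whose stability is being compared; a strongly negative value indicates that higher uncertainty accompanies lower segmentation quality.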
