Abstract

In medical imaging, one factor that can degrade the performance of prediction algorithms is the limited number of observations per class in a labeled dataset. A second set of unlabeled images is usually used to increase the number of samples. However, this set introduces two new problems: (i) it may contain observations of patients with pathologies different from those in the labeled dataset, and (ii) it may contain images drawn from a distribution different from that of the dataset used to train the model. Merging datasets from different sources can therefore have an adverse effect on the distribution of features. Encountering this type of data (better known as out-of-distribution data) in deployment environments may also lead to varying degrees of performance degradation, as the experimental results obtained show. This research studies the behavior of Feature Density as a mathematical model for estimating predictive uncertainty in supervised classification algorithms, with the aim of improving their behavior when out-of-distribution data are present in the dataset. The Feature Density method estimates the density of features by computing histograms (an approximation of the probability density function). Its advantage over the baseline approach (Mahalanobis distance) is that it does not assume a Gaussian distribution of the sample features while still providing an uncertainty estimate. This work focuses on the binary classification of mammography X-ray images from three different datasets, simulating different degrees of contamination with out-of-distribution samples. According to the results obtained, the performance of the proposed method depends directly on the architecture of the neural network used.
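
The histogram-based density estimate described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each sample is already represented by a feature vector (for example, activations from a network layer), that one histogram is fitted per feature dimension, and that the per-dimension log-densities are averaged into a single score. The function names `fit_feature_histograms` and `feature_density_score`, the number of bins, and the smoothing constant `eps` are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_feature_histograms(train_features, bins=32):
    """Fit one normalized histogram per feature dimension (assumed setup)."""
    histograms = []
    for d in range(train_features.shape[1]):
        # density=True normalizes the histogram so it integrates to 1.
        density, edges = np.histogram(train_features[:, d], bins=bins, density=True)
        histograms.append((density, edges))
    return histograms


def feature_density_score(features, histograms, eps=1e-6):
    """Return the mean per-dimension log-density of each sample.

    Lower scores suggest that a sample lies in a low-density region of the
    training features, i.e. it is more likely to be out-of-distribution.
    """
    n_samples, n_dims = features.shape
    log_density = np.zeros(n_samples)
    for d, (density, edges) in enumerate(histograms):
        x = features[:, d]
        # Locate the bin of each value; values outside the fitted range get density 0.
        idx = np.searchsorted(edges, x, side="right") - 1
        idx = np.clip(idx, 0, len(density) - 1)
        inside = (x >= edges[0]) & (x <= edges[-1])
        p = np.where(inside, density[idx], 0.0)
        # eps avoids log(0) for empty bins and out-of-range values.
        log_density += np.log(p + eps)
    return log_density / n_dims
```

Because each histogram is fitted directly to the observed feature values, this score makes no Gaussian assumption, unlike a Mahalanobis distance computed from a mean and covariance. Treating dimensions independently is a simplification of this sketch, not a claim about the method studied here.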
