Abstract

The potential of deep learning to advance lung nodule detection from chest X-rays is significantly compromised by the lack of large annotated databases and by noisy labels in the existing ones. The aim of this study is to investigate the applicability of the novel Confident Learning approach to cleaning chest X-ray databases and improving nodule detection. We took a subset of the NIH Chest X-ray Dataset of 14 Common Thorax Disease Categories that contains only chest X-ray images with nodules present and an equal number of chest X-ray images of healthy lungs. We split this dataset into train and test sets, and split the train set further into four folds to train models with a cross-validation procedure. For each fold, we trained an Xception convolutional neural network to classify chest X-ray images with nodules. We calculated predicted probabilities for the whole train set using cross-validation and evaluated the performance of the trained models on the test set. To identify noisy labels, we applied Confident Learning, a family of theory and algorithms with provable guarantees of exact noise estimation and label error finding. The algorithm takes the noisy labels and predicted probabilities as input and returns the label errors it finds, ordered by their likelihood of being an error. We took the 5% noisiest samples from the list provided by the algorithm and removed them from the train set. We then repeated the training pipeline on the cleaned train set and evaluated it on the test set. The original classification pipeline reached an accuracy of 71.49%; after pruning noisy samples with Confident Learning, the accuracy improved to 72.4%. We also asked a professional radiologist to interpret the label errors found by the Confident Learning algorithm. We gave the radiologist 100 clean chest X-ray images and asked him to classify them.
In most cases, the radiologist's results matched the known dataset labels for the clean X-ray images (72%: 36 TP, 36 TN, 18 FP, 10 FN). The false-positive and false-negative predictions by the radiologist can be explained by the fact that the original dataset contains many instances where several pathologies are present in the same radiogram, and the radiologist most likely attributed these instances to another pathology (not nodular formations). In the next experiment, we gave the radiologist 100 noisy chest X-ray images found by the Confident Learning algorithm. For these noisy images, the radiologist's results and the known dataset labels differed substantially (only 39% matched: 19 TP, 20 TN, 34 FP, 27 FN). Using the detection of lung nodules in X-rays as an example, our experiments showed that cleaning datasets can improve the performance of deep learning algorithms. The experiments with the radiologist showed that noisy samples are most likely incorrectly labeled or contain atypical cases of nodular formations.
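
The pruning step described in the abstract can be illustrated with a minimal sketch. It is a simplified, pure-Python rendering of the thresholding idea behind Confident Learning, not the authors' implementation and not the API of any library. The function names (`class_thresholds`, `rank_label_issues`, `prune_noisiest`) are hypothetical, and two choices are assumptions: candidates are ranked by the predicted probability of their given label, and the 5% fraction is taken relative to the size of the train set.

```python
from typing import List, Sequence, Tuple


def class_thresholds(labels: Sequence[int],
                     probs: Sequence[Sequence[float]]) -> List[float]:
    """Per-class threshold: mean predicted probability of class k
    over the samples whose (possibly noisy) label is k."""
    n_classes = len(probs[0])
    thresholds = []
    for k in range(n_classes):
        vals = [p[k] for y, p in zip(labels, probs) if y == k]
        # A class with no labeled samples gets an unreachable threshold.
        thresholds.append(sum(vals) / len(vals) if vals else float("inf"))
    return thresholds


def rank_label_issues(labels: Sequence[int],
                      probs: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of suspected label errors, most suspicious first."""
    t = class_thresholds(labels, probs)
    issues: List[Tuple[float, int]] = []
    for i, (y, p) in enumerate(zip(labels, probs)):
        # Classes the model predicts confidently for this sample.
        confident = [k for k in range(len(p)) if p[k] >= t[k]]
        if not confident:
            continue
        guess = max(confident, key=lambda k: p[k])
        if guess != y:
            # Assumption: rank by self-confidence, i.e. the probability
            # assigned to the given label (lower means more suspicious).
            issues.append((p[y], i))
    issues.sort()
    return [i for _, i in issues]


def prune_noisiest(labels: Sequence[int],
                   probs: Sequence[Sequence[float]],
                   fraction: float = 0.05) -> Tuple[List[int], List[int]]:
    """Drop the most suspicious samples; return (kept, removed) indices.

    Assumption: `fraction` is relative to the train-set size."""
    ranked = rank_label_issues(labels, probs)
    n_remove = min(len(ranked), int(round(fraction * len(labels))))
    removed = ranked[:n_remove]
    removed_set = set(removed)
    kept = [i for i in range(len(labels)) if i not in removed_set]
    return kept, removed
```

Here `probs` stands for the predicted probabilities obtained for the train set through cross-validation, and `kept` indexes the samples on which the training pipeline would be repeated.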
