Assessment of the Robustness of Convolutional Neural Networks in Labeling Noise by Using Chest X-Ray Images From Multiple Centers.

Ryoungwoo Jang, Miso Jang, Kyung Hwa Lee, Sang Min Lee, Joon Beom Seo, Han Na Noh, Kyung Hee Lee, Namkug Kim

doi:10.2196/18089


Open Access

https://doi.org/10.2196/18089


Journal: JMIR Medical Informatics
Publication Date: Aug 4, 2020
Citations: 17
License type: CC BY

Affiliation: University of Ulsan

Abstract

Background: Computer-aided diagnosis on chest x-ray images using deep learning is a widely studied modality in medicine. Many studies are based on public datasets, such as the National Institutes of Health (NIH) dataset and the Stanford CheXpert dataset. However, the labels in these datasets were generated by classical natural language processing, which may introduce a certain degree of label error.

Objective: This study aimed to investigate the robustness of deep convolutional neural networks (CNNs) for binary classification of posteroanterior chest x-rays when labels are randomly made incorrect.

Methods: We trained and validated a CNN architecture at different levels of label noise on 3 datasets, namely, Asan Medical Center-Seoul National University Bundang Hospital (AMC-SNUBH), NIH, and CheXpert, and tested the models on each test set. Diseases of each chest x-ray in our dataset were confirmed by a thoracic radiologist using computed tomography (CT). The receiver operating characteristic (ROC) curve and the area under the curve (AUC) were evaluated in each test. Randomly chosen chest x-rays from the public datasets were evaluated by 3 physicians and 1 thoracic radiologist.

Results: In the public NIH and CheXpert datasets, AUCs did not drop significantly at label noise levels up to 16%, whereas the AUC of the AMC-SNUBH dataset decreased significantly from 2% label noise onward. Evaluation of the public datasets by 3 physicians and 1 thoracic radiologist showed an accuracy of 65%-80%.

Conclusions: The deep learning–based computer-aided diagnosis model is sensitive to label noise, and computer-aided diagnosis with inaccurate labels is not credible. Furthermore, open datasets such as NIH and CheXpert need to be distilled before being used for deep learning–based computer-aided diagnosis.
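As a reading aid for the AUC figures mentioned in the abstract, the sketch below computes the area under the ROC curve for a binary classifier using its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy inputs are hypothetical; this is not the authors' evaluation code, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (0 or 1).

    Uses the pairwise (Mann-Whitney) formulation: the fraction of
    positive/negative pairs in which the positive case has the higher
    score, counting ties as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: four cases, two per class (values are illustrative only).
    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(y_true, y_score))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why a drop in AUC as label noise increases can be read as a loss of discriminative ability.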

Highlights

Posteroanterior chest x-ray (CXR) is one of the most widely used methods to evaluate a subject’s chest.
The aim of this study is threefold: (1) to train on computed tomography (CT)-confirmed CXR datasets from Asan Medical Center (AMC) and Seoul National University Bundang Hospital (SNUBH), which can be considered clean, with intentionally added label noise of 0%, 1%, 2%, 4%, 8%, 16%, and 32%; (2) to train on the National Institutes of Health (NIH) and CheXpert datasets, which are considered noisy, with intentionally added label noise of 0%, 1%, 2%, 4%, 8%, 16%, and 32%; and (3) to have the NIH and CheXpert datasets re-evaluated by 3 physicians and 1 radiologist.
The results on our dataset reveal that the convolutional neural network (CNN) architecture is extremely sensitive to label noise.
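The label noise levels listed above can be illustrated with a minimal sketch that flips a fixed fraction of binary labels at random. The function name `inject_label_noise`, the seed handling, and the choice to flip exactly `round(rate * n)` labels without regard to class are assumptions made for illustration; the page does not state how the authors selected which labels to corrupt.

```python
import random

# Noise levels named in the highlights: 0%, 1%, 2%, 4%, 8%, 16%, and 32%.
NOISE_LEVELS = [0.0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32]


def inject_label_noise(labels, noise_rate, seed=0):
    """Return a copy of binary labels with a random subset flipped.

    Exactly round(noise_rate * len(labels)) entries are flipped (0 <-> 1).
    The input list is left unchanged. This is an assumed scheme, not
    necessarily the one used in the study.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be between 0 and 1")

    rng = random.Random(seed)  # local generator, so results are reproducible
    noisy = list(labels)
    n_flip = round(noise_rate * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        noisy[i] = 1 - noisy[i]
    return noisy


if __name__ == "__main__":
    clean = [0, 1] * 500  # 1000 hypothetical clean labels
    for rate in NOISE_LEVELS:
        noisy = inject_label_noise(clean, rate, seed=42)
        flipped = sum(a != b for a, b in zip(clean, noisy))
        print(f"noise {rate:.0%}: {flipped} of {len(clean)} labels flipped")
```

In an experiment of this kind, only the training and validation labels would normally be corrupted, while the test labels stay clean so that the measured AUC reflects true performance.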

Summary

Introduction

Posteroanterior chest x-ray (CXR) is one of the most widely used methods to evaluate a subject’s chest. Among the various types of deep learning algorithms, the convolutional neural network (CNN) is the most widely used technique for CXR classification. Many studies are based on public datasets, such as the National Institutes of Health (NIH) dataset and the Stanford CheXpert dataset. The labels in these datasets were generated by classical natural language processing, which may introduce a certain degree of label error. Objective: This study aimed to investigate the robustness of deep CNNs for binary classification of posteroanterior chest x-rays when labels are randomly made incorrect. Methods: We trained and validated a CNN architecture at different levels of label noise on 3 datasets, namely, Asan Medical Center-Seoul National University Bundang Hospital (AMC-SNUBH), NIH, and CheXpert, and tested the models on each test set. Randomly chosen chest x-rays from the public datasets were evaluated by 3 physicians and 1 thoracic radiologist. Open datasets such as NIH and CheXpert need to be distilled before being used for deep learning–based computer-aided diagnosis.

