Assessment of the Robustness of Convolutional Neural Networks in Labeling Noise by Using Chest X-Ray Images From Multiple Centers

Authors :
Joon Beom Seo
Miso Jang
Namkug Kim
Kyung Hee Lee
Ryoungwoo Jang
Sang Min Lee
Kyung Hwa Lee
Han Na Noh
Source :
JMIR Medical Informatics, Vol 8, Iss 8, p e18089 (2020)
Publication Year :
2020
Publisher :
JMIR Publications, 2020.

Abstract

Background: Computer-aided diagnosis on chest x-ray images using deep learning is a widely studied modality in medicine. Many studies are based on public datasets, such as the National Institutes of Health (NIH) dataset and the Stanford CheXpert dataset. However, the labels in these datasets are derived by classical natural language processing, which may introduce a certain amount of label error.

Objective: This study aimed to investigate the robustness of deep convolutional neural networks (CNNs) for binary classification of posteroanterior chest x-rays under random incorrect labeling.

Methods: We trained and validated the CNN architecture with different levels of label noise in 3 datasets, namely, Asan Medical Center-Seoul National University Bundang Hospital (AMC-SNUBH), NIH, and CheXpert, and tested the models with each test set. Diseases of each chest x-ray in our dataset were confirmed by a thoracic radiologist using computed tomography (CT). Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were evaluated in each test. Randomly chosen chest x-rays from the public datasets were evaluated by 3 physicians and 1 thoracic radiologist.

Results: In the public NIH and CheXpert datasets, AUCs did not drop significantly at label noise levels up to 16%. In contrast, the AUC of the AMC-SNUBH dataset decreased significantly starting from 2% label noise. Evaluation of the public datasets by 3 physicians and 1 thoracic radiologist showed an accuracy of 65%-80%.

Conclusions: The deep learning–based computer-aided diagnosis model is sensitive to label noise, and computer-aided diagnosis with inaccurate labels is not credible. Furthermore, open datasets such as NIH and CheXpert need to be distilled before being used for deep learning–based computer-aided diagnosis.
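
The random mislabeling and AUC evaluation described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how labels were flipped or how AUC was computed, so the function names `flip_labels` and `roc_auc`, the fixed seed, and the choice to flip an exact fraction of labels are all assumptions made for illustration. No CNN is trained here; the sketch only shows how a given noise level changes a binary label vector and how an AUC can be computed from scores.

```python
import random


def flip_labels(labels, noise_rate, seed=0):
    """Return a copy of binary labels (0/1) with a fraction `noise_rate`
    of them flipped at randomly chosen positions (hypothetical helper)."""
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be between 0 and 1")
    rng = random.Random(seed)
    n_flip = round(noise_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def roc_auc(labels, scores):
    """AUC as the probability that a positive case scores higher than a
    negative case (Mann-Whitney formulation); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    clean = [0] * 50 + [1] * 50
    # Noise levels chosen to echo the 2% and 16% figures in the abstract.
    for rate in (0.0, 0.02, 0.16):
        noisy = flip_labels(clean, rate)
        changed = sum(a != b for a, b in zip(clean, noisy))
        print(f"noise {rate:.0%}: {changed} of {len(clean)} labels flipped")
```

In an experiment of the kind the abstract describes, the noisy labels would be used for training and validation, while AUC would be computed on a test set against model scores.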

Details

Language :
English
ISSN :
22919694
Volume :
8
Issue :
8
Database :
OpenAIRE
Journal :
JMIR Medical Informatics
Accession number :
edsair.doi.dedup.....3ffa030ee4540ae59b7407c322dfb8ed