Lung Nodule Classification Using Biomarkers, Volumetric Radiomics, and 3D CNNs

Sumeet Menon, Phuong Nguyen, Jayalakshmi Mangalagiri, David R Chapman, Kushal Mehta, Arshita Jain

doi:10.1007/s10278-020-00417-y

Open Access

https://doi.org/10.1007/s10278-020-00417-y

Abstract

We present a hybrid algorithm to estimate lung nodule malignancy that combines imaging biomarkers from radiologists' annotations with image classification of CT scans. Our algorithm employs a 3D Convolutional Neural Network (CNN) and a Random Forest to combine CT imagery with biomarker annotations and volumetric radiomic features. We analyze and compare the performance of the algorithm when classifying the suspicion level of nodule malignancy using only imagery, only biomarkers, imagery + biomarkers, imagery + volumetric radiomic features, and finally imagery + biomarkers + volumetric features. The National Cancer Institute (NCI) Lung Image Database Consortium (LIDC) IDRI dataset is used to train and evaluate the classification task. We show that incorporating semi-supervised learning by means of K-Nearest-Neighbors (KNN) can increase the available training sample size of LIDC-IDRI, thereby further improving the accuracy of malignancy estimation for most of the models tested. However, KNN semi-supervised learning yields no significant improvement when image classification with CNNs and volumetric features is combined with descriptive biomarkers. Unexpectedly, we also show that a model using image biomarkers alone is more accurate than one that combines biomarkers with volumetric radiomics, 3D CNNs, and semi-supervised learning. We discuss the possibility that this result is influenced by cognitive bias in LIDC-IDRI, because malignancy estimates were recorded by the same radiologist panel as the biomarkers, and we outline future work to incorporate pathology information for a subset of study participants.

Highlights

Lung cancer accounts for the highest number of cancer-related deaths globally, but early detection can improve prognosis
Our goal is to develop a hybrid Computer-Aided Diagnosis (CAD) algorithm that combines Convolutional Neural Network (CNN)-based image classification with volumetric radiomics and descriptive biomarkers from radiologists' annotations
We present an algorithm that classifies the malignancy suspicion level of a lung nodule as either malignant, meaning the nodule is highly suspicious, or benign, meaning the nodule is highly unsuspicious

Summary

Introduction

Lung cancer accounts for the highest number of cancer-related deaths globally, but early detection can improve prognosis. Lung cancer screening using low-dose computed tomography (LDCT) has become standard practice as a way of determining which pulmonary nodules are likely benign and which require biopsy to determine malignancy. In clinical practice, lung cancer screening has a high false positive rate because a large percentage of malignant nodules must be identified for biopsy. Our goal is to develop a hybrid Computer-Aided Diagnosis (CAD) algorithm that combines Convolutional Neural Network (CNN)-based image classification with volumetric radiomics and descriptive biomarkers from radiologists' annotations. We evaluate the extent to which descriptive biomarkers can be used for semi-supervised learning in order to help reduce this false positive rate.
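To make the idea of biomarker-based semi-supervised learning concrete, the following is a minimal sketch of KNN pseudo-labeling: nodules without a malignancy label receive the majority label of their nearest labeled neighbors in biomarker space, which enlarges the training set. This text does not specify how KNN is applied, so the function name `knn_pseudo_label`, the choice of `k`, the Euclidean distance, and the toy biomarker vectors are all illustrative assumptions, not the authors' implementation or data.

```python
import math
from collections import Counter


def knn_pseudo_label(labeled, unlabeled, k=3):
    """Give each unlabeled vector the majority label of its k nearest labeled neighbors.

    labeled:   list of (feature_vector, label) pairs
    unlabeled: list of feature vectors
    Returns a list of (feature_vector, pseudo_label) pairs.
    Hypothetical helper for illustration only.
    """
    pseudo = []
    for x in unlabeled:
        # Rank labeled samples by Euclidean distance to x and keep the k closest.
        neighbors = sorted(labeled, key=lambda pair: math.dist(pair[0], x))[:k]
        # Majority vote over the neighbors' labels.
        votes = Counter(label for _, label in neighbors)
        pseudo.append((x, votes.most_common(1)[0][0]))
    return pseudo


# Toy biomarker vectors (made-up ratings, NOT values from LIDC-IDRI).
labeled = [
    ((1.0, 1.0, 5.0), "benign"),
    ((1.0, 2.0, 4.0), "benign"),
    ((2.0, 1.0, 5.0), "benign"),
    ((5.0, 4.0, 2.0), "malignant"),
    ((4.0, 5.0, 1.0), "malignant"),
    ((5.0, 5.0, 2.0), "malignant"),
]
unlabeled = [(1.0, 1.0, 4.0), (4.0, 4.0, 2.0)]

pseudo = knn_pseudo_label(labeled, unlabeled, k=3)

# The augmented set could then be used to train a downstream classifier.
augmented = labeled + pseudo
```

An odd `k` avoids tied votes in the two-class case. In a real pipeline the pseudo-labeled samples would typically be kept out of the evaluation set so that reported accuracy reflects only expert-assigned labels.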

Objectives

Results

Discussion

Conclusion