Abstract

Objective: To explore the value of quantitative CT radiomics features in predicting epidermal growth factor receptor (EGFR) mutation in lung cancer. Methods: Data from 144 patients with lung cancer and EGFR gene test results (75 males, 69 females; median age 54 (25-68) years), diagnosed at the First Affiliated Hospital of Soochow University, were retrospectively analyzed. Of these, 81 patients had EGFR mutations (39 males, 42 females; median age 52 (25-64) years) and 63 patients had wild-type EGFR (36 males, 27 females; median age 56 (32-68) years). Patients were randomly assigned to a training group and a validation group at a ratio of 2:1. MaZda software was used to extract radiomics features, including the gray-level histogram (GLH), absolute gradient (GRA), gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), auto-regressive model (ARM) and wavelet transform (WAV). Three selection methods, Fisher coefficients (Fisher), classification error probability combined with average correlation coefficients (POE+ACC) and mutual information (MI), were each used to select 10 optimal features forming an optimal feature subset. The optimal feature subsets were analyzed with linear discriminant analysis (LDA) and nonlinear discriminant analysis (NDA) to calculate the accuracy, sensitivity and specificity for differentiating EGFR-mutant from EGFR wild-type lung cancers. A prediction model was built with an artificial neural network (ANN) from the optimal feature subset that achieved the highest accuracy in the training group, and was then used to differentiate EGFR-mutant from wild-type cases in the validation group. Results: MaZda software extracted a total of 301 quantitative features from the CT images of the EGFR-mutant and EGFR wild-type patients in the training group. The optimal feature subsets obtained from Fisher-NDA and (POE+ACC)-NDA had the highest accuracy, 93.8%, for differentiating EGFR-mutant from EGFR wild-type lung cancer in the training group. The prediction model built on the Fisher-NDA optimal feature subset had an accuracy, sensitivity and specificity of 83.3%, 86.7% and 77.8%, respectively, for differentiating EGFR-mutant from EGFR wild-type lung cancer in the validation group. Conclusion: The optimal subset of CT radiomics features has high accuracy in predicting EGFR mutations in lung cancer, providing a new method for predicting gene expression of lung cancer.
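
The three validation figures follow from a 2x2 confusion matrix with EGFR mutation as the positive class. The sketch below is a minimal illustration of that arithmetic, not the authors' code. The abstract does not report case counts, so the counts used here are an assumption: a 2:1 split of 144 patients would leave 48 in the validation group, and 30 mutant plus 18 wild-type cases is one split that happens to reproduce the reported percentages. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as fractions.

    tp/fn: mutant cases classified as mutant / wild type.
    tn/fp: wild-type cases classified as wild type / mutant.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total       # all correct calls over all cases
    sensitivity = tp / (tp + fn)       # mutant cases correctly identified
    specificity = tn / (tn + fp)       # wild-type cases correctly identified
    return accuracy, sensitivity, specificity


# ASSUMED counts, not taken from the abstract: 30 mutant cases (26 detected)
# and 18 wild-type cases (14 correctly identified), 48 cases in total.
acc, sens, spec = diagnostic_metrics(tp=26, fn=4, tn=14, fp=4)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
# prints: accuracy=83.3% sensitivity=86.7% specificity=77.8%
```

Under the same assumption, 96 patients would fall in the training group, and 90 correct classifications out of 96 would give the reported 93.8% training accuracy.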

