Abstract
Background: Lung adenocarcinoma is the most common type of lung cancer. Whole-genome sequencing studies have disclosed the genomic landscape of lung adenocarcinomas. However, it remains unclear whether these genetic alterations could guide prognosis prediction. Effective genetic markers, and prediction models based on them, are also lacking for prognosis evaluation.
Methods: We obtained somatic mutation data and clinical data for 371 lung adenocarcinoma cases from The Cancer Genome Atlas. The cases were classified into two prognostic groups by 3-year survival, the somatic mutation frequencies of genes were compared between the groups, and computational models were then developed to discriminate between the two prognoses.
Results: Genes were found with higher mutation rates in the good (≥ 3-year survival) than in the poor (< 3-year survival) prognosis group of lung adenocarcinoma patients. Genes participating in cell-cell adhesion and motility were significantly enriched among the genes with the largest mutation rate differences between the good and poor prognosis groups. Support Vector Machine models using gene somatic mutation features predicted prognosis well, and performance improved as feature size increased. An 85-gene model reached an average cross-validated accuracy of 81% and an Area Under the Curve (AUC) of 0.896 for the Receiver Operating Characteristic (ROC) curves. The model also exhibited good inter-stage prognosis prediction performance, with an average AUC of 0.846 for the ROC curves.
Conclusion: The prognosis of lung adenocarcinomas is related to somatic gene mutations. The genetic markers could be used for prognosis prediction and could furthermore provide guidance for personalized medicine.
Highlights
Lung adenocarcinoma is the most common type of lung cancer
Somatic mutation difference between groups with different prognosis: survival analysis was performed on the lung adenocarcinoma (LUAD) cases with both genome sequencing information and clinical follow-up data (Fig. 1a)
To observe the possible association of somatic mutations with LUAD prognosis, gene mutation rates were compared between the two prognostic groups
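As a minimal sketch of such a comparison, the Python below computes per-gene mutation rates in the two prognostic groups and ranks genes with a two-sided Fisher's exact test. The choice of Fisher's exact test, the function names, and the toy gene labels (GENE_A, GENE_B) are assumptions for illustration; the text above does not state which statistical test was used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: good / poor prognosis group. Columns: mutated / not mutated.
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def prob(x):
        # Hypergeometric probability of x mutated cases in the first row.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Sum every table that is at most as likely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

def rank_genes(good, poor):
    """Rank genes by p-value; each case is the set of genes mutated in it.

    Returns tuples (gene, rate_in_good, rate_in_poor, p_value).
    """
    genes = set().union(*good, *poor)
    rows = []
    for gene in genes:
        a = sum(gene in case for case in good)
        c = sum(gene in case for case in poor)
        p = fisher_exact_two_sided(a, len(good) - a, c, len(poor) - c)
        rows.append((gene, a / len(good), c / len(poor), p))
    return sorted(rows, key=lambda r: r[3])

# Toy data (hypothetical, not taken from TCGA).
good = [{"GENE_A"}, {"GENE_A"}, {"GENE_A", "GENE_B"}]
poor = [{"GENE_B"}, set(), set()]
ranking = rank_genes(good, poor)
```

With real data, the p-values from a gene-by-gene scan like this would normally need a multiple-testing correction before any gene is called significant.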
Summary
Yu et al. BMC Cancer (2019) 19:263
Lung adenocarcinoma is the most common type of lung cancer. Whole-genome sequencing studies have disclosed the genomic landscape of lung adenocarcinomas. However, it remains unclear whether these genetic alterations could guide prognosis prediction. Effective genetic markers, and prediction models based on them, are lacking for prognosis evaluation. Molecular markers such as EGFR, ERCC1, RRM1, BRCA1, and RET have been experimentally identified and tested for prognostic prediction [15,16,17]. It remains difficult to find the most significant genetic features and to build a highly effective predictive model for treatment outcomes. We collected large-scale LUAD case data with both genomic and clinical information (n = 371) from TCGA (The Cancer Genome Atlas) (http://cancergenome.nih.gov), analyzed the somatic mutation differences between two groups categorized by 3-year overall survival, and developed a machine learning model to predict prognosis based on the most significant genetic markers. The training datasets and models for the treatment outcome prediction of lung carcinoma are freely accessible through the website http://www.szu-bioinf.org/CPP/LUADpp
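The two preparatory steps described above, grouping cases by 3-year overall survival and turning somatic mutations into model inputs, could be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the 1095-day cutoff, the exclusion of cases censored before 3 years, the binary 0/1 encoding, all function names, and the toy marker names are hypothetical.

```python
# Assumed cutoff in days; the source only says "3-year overall survival".
THREE_YEARS_DAYS = 3 * 365

def label_prognosis(followup_days, deceased):
    """Return "good", "poor", or None when the 3-year outcome is unknown."""
    if followup_days >= THREE_YEARS_DAYS:
        return "good"   # known to have survived at least 3 years
    if deceased:
        return "poor"   # died before the 3-year mark
    return None         # alive but followed for < 3 years (assumed excluded)

def encode_mutations(mutated_genes, marker_genes):
    """Binary feature vector: 1 if the marker gene is mutated, else 0."""
    return [1 if gene in mutated_genes else 0 for gene in marker_genes]

def build_dataset(cases, marker_genes):
    """Build features X and labels y (1 = good, 0 = poor) from case tuples.

    Each case is (followup_days, deceased, set_of_mutated_genes).
    """
    X, y = [], []
    for followup_days, deceased, mutated_genes in cases:
        label = label_prognosis(followup_days, deceased)
        if label is None:
            continue  # skip cases whose 3-year outcome cannot be determined
        X.append(encode_mutations(mutated_genes, marker_genes))
        y.append(1 if label == "good" else 0)
    return X, y

# Toy cases (hypothetical, not taken from TCGA).
cases = [
    (1500, False, {"GENE_A"}),
    (300, True, {"GENE_B"}),
    (200, False, {"GENE_A", "GENE_B"}),
]
X, y = build_dataset(cases, ["GENE_A", "GENE_B"])
```

The resulting X and y could then be passed to any binary classifier and evaluated with cross-validation.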