Transcriptional profiling of lung adenocarcinomas has identified numerous gene expression phenotype (GEP) and risk prediction (RP) signatures associated with patient outcome. However, classification agreement between signatures, the underlying transcriptional programs, and independent signature validation are less well studied. Published transcriptional profiles from 17 cohorts comprising 2395 lung adenocarcinomas with available patient outcome data were collected from authors’ websites or public repositories as described in the original studies. Tumors were classified according to 18 different GEPs or RPs derived from microarray analysis of lung adenocarcinoma or NSCLC cohorts, using the originally reported classification schemes or consensus clustering of gene signatures on a per-cohort basis. To independently assess the connection between classification by the lung cancer-derived signatures and classification by expression of proliferation-related genes only, a 155-gene breast cancer proliferation signature was used to classify all tumors as low-proliferative (average expression of the 155 genes below the median) or high-proliferative. To contrast subtype classifications with biological functions, tumors were also scored on a per-cohort basis according to five reported expression metagenes in lung cancer representing different biological processes: proliferation, immune response, basal/squamous, stroma/extracellular matrix (ECM), and expression of Napsin A/surfactants. Sixteen of the 18 signatures were associated with patient survival in the total cohort and in multiple individual cohorts. For significant signatures, total cohort hazard ratios were ∼2 in univariate analyses (mean = 1.95, range = 1.4-2.6). Signatures derived in adenocarcinoma generally displayed better classification agreement than signatures derived in mixed NSCLC cohorts.
Strong classification agreement between signatures was observed, especially for patients predicted as low-risk by adenocarcinoma-derived signatures, despite generally low gene overlap. Expression of proliferation-related genes correlated strongly with GEP subtype classifications and RP scores, driving the association of the gene signatures with prognosis. A three-group consensus definition of samples across 10 GEP classifiers demonstrated aggregation of samples with specific smoking patterns, gender, and EGFR/KRAS mutations, while survival differences were significant only when patients were divided into low- and high-risk groups, resembling a terminal respiratory unit (TRU)-like and non-TRU division, respectively. By providing this consensus of current GEPs and RPs in lung adenocarcinoma, we have connected molecular phenotypes, risk predictions, patient outcome, and the underlying transcriptional programs of the different classifier types. Our results provide general insight into the nature and agreement of GEP and RP signatures in the disease, and into their prognostic value.
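The low/high proliferation split described above (average expression of the signature genes compared against the median) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `classify_proliferation`, the genes-by-samples matrix layout, and the choice to take the median of the per-sample scores within a single cohort are all hypothetical, and any normalization or gene-centering applied in the original study is omitted.

```python
import numpy as np


def classify_proliferation(expr, gene_ids, signature_genes):
    """Split the samples of ONE cohort into low/high proliferation groups.

    expr            : 2-D array, rows = genes, columns = samples (assumed layout)
    gene_ids        : gene identifier for each row of expr
    signature_genes : identifiers of the proliferation signature genes

    Assumption: the cutoff is the median of the per-sample average
    signature expression within this cohort.
    """
    expr = np.asarray(expr, dtype=float)
    signature = set(signature_genes)

    # Keep only the rows whose gene is part of the signature.
    rows = [i for i, gene in enumerate(gene_ids) if gene in signature]
    if not rows:
        raise ValueError("no signature genes found in this cohort")

    # Per-sample score: mean expression across the signature genes.
    scores = expr[rows, :].mean(axis=0)

    # Samples strictly below the cohort median are called low-proliferative.
    cutoff = np.median(scores)
    labels = np.where(scores < cutoff, "low", "high")
    return scores, labels
```

Running this once per cohort keeps the cutoff cohort-specific, which matches the per-cohort handling described for the other classifications; whether the proliferation split was also computed per cohort is an assumption here.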