In patients with head and neck cancer (HNC), lymph node (N) metastases are associated with cancer aggressiveness and poor prognosis. Identifying meaningful gene modules and representative biomarkers relevant to the N stage helps predict prognosis and reveal mechanisms underlying tumor progression. The present study used a step-wise approach for weighted gene co-expression network analysis (WGCNA). Dataset GSE65858 was subjected to WGCNA. RNA sequencing data of HNC downloaded from the Cancer Genome Atlas (TCGA) and dataset GSE39366 were utilized to validate the results. Following data preprocessing, 4,295 genes were screened, and blue and black modules associated with the N stage of HNC were identified. A total of 16 genes in the blue module that had a negative association with the N stage [keratinocyte differentiation associated protein, suprabasin, cornifelin (CNFN), small proline rich protein 1B, desmoglein 1 (DSG1), chromosome 10 open reading frame 99, keratin 16 pseudogene 3, gap junction protein β2, dermokine, LY6/PLAUR domain containing 3, transmembrane protein 79, phospholipase A2 group IVE, transglutaminase 5, potassium two pore domain channel subfamily K member 6, involucrin, kallikrein related peptidase 8], and two genes in the black module that had a positive association (structural maintenance of chromosomes 4 and mutS homolog 6), were identified as candidate hub genes. Following further validation in TCGA and dataset GSE65858, CNFN and DSG1 were found to be associated with the clinical stage of HNC. Survival analysis of CNFN and DSG1 was subsequently performed. Patients with increased expression of CNFN had a higher survival probability in dataset GSE65858 and TCGA. Therefore, CNFN was selected as the hub gene for further verification in the Gene Expression Profiling Interactive Analysis database. Finally, functional enrichment and gene set enrichment analyses were performed using datasets GSE65858 and GSE39366. Three gene sets, namely ‘P53 pathway’, ‘estrogen response early’ and ‘estrogen response late’, were enriched in the two datasets. In conclusion, CNFN, identified via the WGCNA algorithm, may contribute to the prediction of lymph node metastases and prognosis, probably by regulating the pathways associated with P53, and the early and late estrogen response.
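For readers unfamiliar with WGCNA, the core steps named above (a soft-thresholded co-expression network, a module eigengene, and its correlation with a clinical trait such as the N stage) can be illustrated with a minimal sketch. This is not the pipeline used in the study, which would typically rely on the R WGCNA package; the synthetic data, the soft-threshold power `beta=6`, the unsigned network, and all function names are assumptions made purely for illustration.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency |cor|^beta from a samples x genes matrix.

    beta=6 is an assumed illustrative value, not one taken from the study.
    """
    cor = np.corrcoef(expr, rowvar=False)
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 0.0)  # ignore self-connections
    return adj


def module_eigengene(expr_module):
    """First principal component of the standardized module expression."""
    z = (expr_module - expr_module.mean(axis=0)) / expr_module.std(axis=0)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eig = u[:, 0]
    # Orient the eigengene so it tracks the module's average expression.
    if np.corrcoef(eig, z.mean(axis=1))[0, 1] < 0:
        eig = -eig
    return eig


def module_trait_correlation(eigengene, trait):
    """Pearson correlation between a module eigengene and a clinical trait."""
    return float(np.corrcoef(eigengene, trait)[0, 1])


# Synthetic example (hypothetical data, not from GSE65858 or TCGA).
rng = np.random.default_rng(0)
n_samples = 60
trait = rng.integers(0, 4, size=n_samples).astype(float)  # mock N stage 0-3

# Five co-expressed genes whose expression falls as the mock N stage rises,
# plus five unrelated noise genes.
module_genes = -trait[:, None] + rng.normal(0.0, 0.5, size=(n_samples, 5))
noise_genes = rng.normal(0.0, 1.0, size=(n_samples, 5))
expr = np.hstack([module_genes, noise_genes])

adj = soft_threshold_adjacency(expr)
connectivity = adj.sum(axis=0)  # per-gene connectivity, a common hub criterion
eigengene = module_eigengene(expr[:, :5])
r = module_trait_correlation(eigengene, trait)
```

In this toy setting the five module genes have higher connectivity than the noise genes and the eigengene correlates negatively with the mock trait, which mirrors the direction of association reported for the blue module.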