Abstract
Identifying prognostic genes can aid the clinical management of non-small cell lung cancer (NSCLC). However, the prognostic genes identified in different NSCLC studies overlap very little. One possible reason is inadequate sample size. Here, the effect of sample size on prognostic gene analysis was investigated using 515 stage II/III NSCLC cases from two cohorts profiled by whole-exome sequencing. Prognostic gene analysis was repeated 100 times at each sample size level using random resampling. In stage II lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) cases from the TCGA Pan-Lung Cancer cohort, the number of statistically significant prognostic genes first increased with sample size following a power law, then fluctuated around a steady level, and finally decreased slightly. Power-law growth curves were also observed in stage III LUAD and LUSC cases from the TCGA Pan-Lung Cancer cohort and in stage III Chinese LUAD cases from the OncoSG cohort. The R² values of the fitted power-law growth curves were all greater than 0.99. In addition, at the sample size level where the number of prognostic genes peaked, the mean proportion of true prognostic genes in patients with stage II LUAD and LUSC was 28.32% and 23.12%, respectively, which could partly explain the limited overlap in prognostic genes between reports. In conclusion, the number of prognostic genes in NSCLC grows with sample size according to a power law, independent of histopathological subtype, race, and stage. These results also show how sample size affects the reliability of prognostic genes and will aid trial design for genomic mutation-based prognostic studies in NSCLC.
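The resampling and curve-fitting procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: subsets are drawn without replacement, the power law N = a * n^b is fitted by ordinary least squares on log-transformed values, and R² is computed in log space. The names `fit_power_law`, `mean_count_by_sample_size`, and `count_significant` are hypothetical; `count_significant` is a stand-in for whatever per-gene survival test a study uses.

```python
import math
import random


def fit_power_law(sizes, counts):
    """Fit counts ~ a * sizes**b by least squares on log-log values.

    Returns (a, b, r2). Assumption: r2 is computed in log space, which
    may differ from how the R^2 in the abstract was obtained.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in counts]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                   # exponent of the power law
    log_a = mean_y - b * mean_x     # log of the scale factor
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return math.exp(log_a), b, r2


def mean_count_by_sample_size(cases, sizes, count_significant,
                              repeats=100, seed=0):
    """Average the number of significant genes over repeated random subsets.

    count_significant is a hypothetical callable: it takes a list of cases
    and returns how many genes pass the chosen significance test.
    Assumption: subsets are sampled without replacement.
    """
    rng = random.Random(seed)
    means = []
    for n in sizes:
        total = 0
        for _ in range(repeats):
            subset = rng.sample(cases, n)
            total += count_significant(subset)
        means.append(total / repeats)
    return means
```

In use, the mean counts returned by `mean_count_by_sample_size` for the growth phase would be passed to `fit_power_law` together with the corresponding sample sizes. Only the rising part of the curve should be fitted, since the abstract reports that the counts later plateau and decline slightly.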