Abstract

Background: Cancer prognosis-related signatures have traditionally been constructed based on gene expression profiles derived from tumor or normal tissues. However, the potential benefits of incorporating gene expression profiles from both tumor and normal tissues to improve signature performance have not been explored.

Methods: In this study, we developed three prognostic models for lung adenocarcinoma (LUAD) using gene expression profiles from tumor tissues, normal tissues, and a combination (COM) of both, sourced from The Cancer Genome Atlas (TCGA). To ensure comparability, the same workflow was followed for all three models.

Results: When applied to the TCGA LUAD dataset, the tumor-derived model exhibited the best overall performance, except in calibration analysis, where the normal-derived model performed better. The COM-derived model demonstrated intermediate performance. Validation on three independent test datasets revealed that the COM-derived model showed the best performance, while the normal-derived model showed the worst. In overall survival (OS) analysis, the low-risk group defined by the COM-derived model consistently exhibited longer mean survival times. The tumor-derived model did not consistently show this trend, and the normal-derived model produced opposite results. In discrimination analysis, no significant differences were observed. The COM-derived model demonstrated good discrimination ability for short periods, while the tumor-derived model performed better for longer periods. In calibration analysis, both the COM and tumor-derived models had similar absolute prediction errors, which were better than those of the normal-derived model. However, the tumor-derived model tended to underestimate survival rates. The clinical feature analysis and validation in GSE229705 indicate that the risk score (RS) from the COM model is the most clinically significant. These results demonstrate that the COM model's RS aligns more closely with clinical data, maintaining stable performance and the strongest generalizability.

Conclusions: Overall, the COM-derived model demonstrated the best generalization ability. The superior performance of the tumor-derived model in the TCGA LUAD dataset might be due to overfitting. Our results suggest that appropriate combinations of gene expression data from tumor and normal tissues can enhance the predictive power of prognostic signatures.
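
As a rough illustration of how a risk score can define the low-risk and high-risk groups compared in the OS analysis, the sketch below computes a score per sample and compares mean survival time between groups. The abstract does not give the scoring formula or the cutoff, so the weighted-sum score, the median split, the gene names, the coefficients, and the survival values are all assumptions made for illustration. Censoring is ignored here, which a real survival analysis would need to handle.

```python
from statistics import mean, median

def risk_score(expression, coefficients):
    """Weighted sum of expression values over the signature genes (assumed form)."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def split_by_median(scores):
    """Label a sample 'high' if its score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {sample: ("high" if value > cutoff else "low")
            for sample, value in scores.items()}

# Hypothetical toy data: not taken from TCGA or any dataset in the study.
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
samples = {
    "s1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "s2": {"GENE_A": 0.5, "GENE_B": 3.0},
    "s3": {"GENE_A": 3.0, "GENE_B": 0.5},
    "s4": {"GENE_A": 1.0, "GENE_B": 2.0},
}
survival_months = {"s1": 20.0, "s2": 60.0, "s3": 12.0, "s4": 48.0}

scores = {name: risk_score(expr, coefficients) for name, expr in samples.items()}
groups = split_by_median(scores)

# Mean survival per group; this simple mean ignores censored observations.
mean_by_group = {
    label: mean(survival_months[s] for s, g in groups.items() if g == label)
    for label in ("low", "high")
}
print(groups)
print(mean_by_group)
```

In this toy cohort the low-risk group has the longer mean survival, which is the pattern the abstract reports for the COM-derived model on the independent test datasets.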
