Abstract

International initiatives such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) are collecting multiple datasets at different genome scales with the aim of identifying novel cancer biomarkers and predicting patient survival. Several statistical methods have been applied to analyze such data, among them Cox regression models. Although these models provide a good statistical framework for analyzing omic data, few studies illustrate the advantages and drawbacks of integrating biological information and selecting groups of biomarkers. In fact, classical Cox regression algorithms focus on the selection of a single biomarker, without taking into account the strong correlation between genes. Network-based Cox regression algorithms overcome this drawback, but they are less widely used within the life science community. In this article, we aim to provide a clear methodological framework for using such approaches in order to turn cancer research results into clinical applications. We first discuss the rationale and the practical usage of three recently proposed network-based Cox regression algorithms (i.e., Net-Cox, AdaLnet, and fastcox). Then, we show how to combine existing biological knowledge and available data with such algorithms to identify networks of cancer biomarkers and to estimate patient survival. Finally, we describe in detail a new permutation-based approach to better validate the significance of the selection in terms of cancer gene signatures and pathway/network identification. We illustrate the proposed methodology by means of both simulations and real case studies. Overall, the aim of our work is two-fold: first, to show how network-based Cox regression models can be used to integrate biological knowledge (e.g., multi-omics data) for the analysis of survival data; second, to provide a clear methodological and computational approach for investigating cancer regulatory networks.
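
To make the idea of a network-based Cox penalty concrete, the following is a minimal sketch, not the implementation of Net-Cox, AdaLnet, or fastcox. It assumes a generic objective of the form "negative Cox partial log-likelihood plus a graph-Laplacian smoothness penalty", no tied event times, and an unweighted symmetric gene adjacency matrix. The function names and the parameter lam are hypothetical and chosen only for illustration.

```python
import numpy as np


def neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox partial log-likelihood (assumes no tied event times)."""
    eta = X @ beta                      # linear predictor, one value per patient
    order = np.argsort(-time)           # sort patients by decreasing follow-up time
    eta_sorted = eta[order]
    event_sorted = event[order]
    # Running log-sum-exp: entry i is the log of the summed risk scores over
    # the risk set of patient i (everyone with time >= that patient's time).
    log_risk = np.logaddexp.accumulate(eta_sorted)
    return -np.sum((eta_sorted - log_risk)[event_sorted == 1])


def laplacian_penalty(beta, adjacency):
    """Smoothness penalty beta^T L beta with L = D - A (illustrative choice)."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return float(beta @ laplacian @ beta)


def network_cox_objective(beta, X, time, event, adjacency, lam=1.0):
    """Hypothetical penalized objective: likelihood term plus network term."""
    return neg_log_partial_likelihood(beta, X, time, event) + lam * laplacian_penalty(beta, adjacency)


# Toy data: 3 patients, 2 genes connected by one edge (all values made up).
X = np.array([[1.0, 0.5], [0.2, 0.1], [0.0, 1.0]])
time = np.array([3.0, 2.0, 1.0])
event = np.array([1, 1, 1])
adjacency = np.array([[0.0, 1.0], [1.0, 0.0]])
print(network_cox_objective(np.zeros(2), X, time, event, adjacency))
```

The Laplacian term penalizes differences between coefficients of connected genes, which is one common way to encourage the selection of groups of correlated biomarkers rather than single genes. The published algorithms differ in how they combine this kind of term with sparsity penalties.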

Highlights

  • Recent developments in high-throughput technology have produced a huge amount of multiple and diverse genome-scale data to deal with biological and clinical questions in cancer

  • We present a methodological framework for the analysis of molecular and survival data through a cross-validated approach of network-based Cox regression algorithms (Net-Cox, AdaLnet, and fastcox; see the Methods section)

  • We considered two scenarios that are likely to be encountered in genomic studies and simulated network-constrained gene expression data


Summary

Introduction

Recent developments in high-throughput technology have produced a huge amount of multiple and diverse genome-scale data to deal with biological and clinical questions in cancer. Such a volume of data is likely to revolutionize genetics and biomedical cancer research, but a thorough integration of all these different types of information is necessary. For example, it has been found that Cetuximab works only if the K-RAS gene is not mutated. Another example is the anti-cancer drug Trastuzumab (Hudis, 2007), which is effective only in patients that highly express the human epidermal growth factor receptor 2 (HER2) at the cell surface, to which the antibody binds. These examples highlight the need to identify stable and interpretable biomarkers that can predict patient survival and guide patient-personalized therapy. By discovering associations within the data, big data analytics has the potential to improve care, save lives, and lower costs.

