Abstract

Recent advances in high-throughput genomic technologies have created a growing demand for statistical tools that help identify molecular changes as potential prognostic biomarkers or druggable targets for personalized precision medicine. In this study, we developed a web-based, interactive and user-friendly platform for high-dimensional analysis of molecular alterations in cancer (HDMAC) (https://ripsung26.shinyapps.io/rshiny/). HDMAC offers several penalized regression models suitable for high-dimensional data analysis (Ridge, Lasso and adaptive Lasso), with Cox regression for survival outcomes and logistic regression for binary outcomes. An optional first-step screening is provided to address the multiple-comparison issue that often arises with large-volume genomic data. A hazard ratio or estimated coefficient is reported for each selected gene so that a multivariate regression model may be built from the selected genes. Cross-validation is provided to estimate the predictive power of each regression model. In addition, R code is provided to facilitate download of whole sets of molecular variables from TCGA. We illustrate the use of HDMAC with one data set on gene mutations and one on mRNA expression from ovarian cancer patients, and one data set on mRNA expression from bladder cancer patients. From the analysis of each data set, we obtained a list of candidate genes that might be associated with mutations or abnormal expression of genes in ovarian and bladder cancers. HDMAC offers a solution for rigorous analysis and validation of high-dimensional genomic data.

Highlights

  • There are several web tools available for researchers to analyze genomic data

  • The data are (xi, yi), i = 1, ..., n, where xi = (xi1, ..., xiM) is the covariate vector of the ith subject, such as copy number variation (CNV), gene expression or mutation status (M is the number of genes), and yi is the binary response for the ith subject, such as stage or tumor subtype

  • We constructed a package of high-dimensional analysis of molecular alterations in cancer, HDMAC, and made it a web-based platform at https://ripsung26.shinyapps.io/rshiny/
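For reference, the penalized logistic fit implied by the data description above can be written in a standard textbook form. The symbols β, λ, p_i, w_j, β̃_j and γ are notation assumed here, not taken from the source, and the intercept is omitted for brevity.

```latex
\hat{\beta} = \arg\min_{\beta}\left\{
  -\frac{1}{n}\sum_{i=1}^{n}\bigl[\,y_i \log p_i + (1-y_i)\log(1-p_i)\,\bigr]
  + \lambda\, P(\beta)\right\},
\qquad
p_i = \frac{1}{1+\exp(-x_i^{\top}\beta)},

P_{\text{ridge}}(\beta) = \sum_{j=1}^{M}\beta_j^{2},
\qquad
P_{\text{lasso}}(\beta) = \sum_{j=1}^{M}\lvert\beta_j\rvert,
\qquad
P_{\text{adaptive}}(\beta) = \sum_{j=1}^{M} w_j\lvert\beta_j\rvert,
\quad w_j = \frac{1}{\lvert\tilde{\beta}_j\rvert^{\gamma}}.
```

Here β̃_j denotes an initial estimate (for example from a Ridge fit) and γ > 0 is a tuning constant. For survival outcomes, the negative log-likelihood would be replaced by the negative Cox partial log-likelihood, with the same penalties.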


Summary

Introduction

There are several web tools available for researchers to analyze genomic data. For example, cBioPortal provides simultaneous display of RNA expression, mutations, copy number alterations and protein expression, with multiple choices of plots for visualization[5,6]. Both the Lasso and the adaptive Lasso can be used for variable selection, with the latter selecting fewer variables than the former. These penalized regression methods have been widely used in large-scale genetic studies in recent years, for example in the identification of gene-gene interactions, gene selection in a high-dimensional cancer classification problem, and a transcriptome analysis of pancreatic cancer survival[18,19,20]. We aimed to develop a web-based, interactive and user-friendly platform to fulfill the following goals. It would fit regression models with survival and binary outcomes and high-dimensional genetic covariates, with the option of including clinical covariates. We also aimed to provide all relevant code on GitHub for users' convenience.
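The difference between the Lasso and the adaptive Lasso can be sketched in a few lines of plain Python for a binary outcome. This is an illustration only, not HDMAC's code: the function names (`soft_threshold`, `adaptive_weights`, `lasso_logistic`), the proximal-gradient solver, the step size, and the omission of an intercept are all assumptions made for this example.

```python
import math


def soft_threshold(z, t):
    """Proximal map of t * |z|: shrink z toward zero by t, clipping at zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def adaptive_weights(beta_init, gamma=1.0, eps=1e-8):
    """Adaptive-Lasso weights w_j = 1 / |beta_init_j|^gamma (eps guards against 0)."""
    return [1.0 / max(abs(b), eps) ** gamma for b in beta_init]


def lasso_logistic(X, y, lam, weights=None, step=0.1, n_iter=2000):
    """L1-penalized logistic regression (no intercept) by proximal gradient descent.

    weights=None gives the ordinary Lasso; passing per-coefficient weights
    gives the adaptive Lasso penalty lam * sum_j w_j * |beta_j|.
    """
    n, m = len(X), len(X[0])
    w = [1.0] * m if weights is None else list(weights)
    beta = [0.0] * m
    for _ in range(n_iter):
        # Gradient of the average negative log-likelihood.
        grad = [0.0] * m
        for xi, yi in zip(X, y):
            resid = sigmoid(sum(b * x for b, x in zip(beta, xi))) - yi
            for j in range(m):
                grad[j] += resid * xi[j] / n
        # Gradient step followed by coefficient-wise soft-thresholding.
        beta = [
            soft_threshold(b - step * g, step * lam * wj)
            for b, g, wj in zip(beta, grad, w)
        ]
    return beta
```

A typical two-stage use would first obtain initial coefficients (for instance from a Ridge or unweighted Lasso fit), pass them through `adaptive_weights`, and refit with `weights=` set. Coefficients that were small in the first stage then receive a larger penalty and are more likely to be set exactly to zero, which is why the adaptive Lasso tends to select fewer variables.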
