Abstract

Background: Censored data are increasingly common in many microarray studies that attempt to relate gene expression to patient survival. Several new methods have been proposed in the last two years. Most of these methods, however, are not available to biomedical researchers, leading to many re-implementations from scratch of ad-hoc, and suboptimal, approaches with survival data.

Results: We have developed SignS (Signatures for Survival data), an open-source, freely available, web-based tool and R package for gene selection, building molecular signatures, and prediction with survival data. SignS implements four methods which, according to existing reviews, perform well and, being of a very different nature, offer complementary approaches. We use parallel computing via MPI, leading to large decreases in user waiting time. Cross-validation is used to assess predictive performance and stability of solutions, the latter an issue of increasing concern given that there are often several solutions with similar predictive performance. Biological interpretation of results is enhanced because genes and signatures in models can be sent to other freely available on-line tools for examination of PubMed references, GO terms, and KEGG and Reactome pathways of selected genes.

Conclusion: SignS is the first web-based tool for survival analysis of expression data, and one of the very few with biomedical researchers as target users. SignS is also one of the few bioinformatics web-based applications to extensively use parallelization, including fault tolerance and crash recovery. Because of its combination of methods implemented, usage of parallel computing, code availability, and links to additional databases, SignS is a unique tool, and will be of immediate relevance to biomedical researchers, biostatisticians and bioinformaticians.

Highlights

  • Censored data are increasingly common in many microarray studies that attempt to relate gene expression to patient survival

  • Many of these papers have emphasized gene selection and survival prediction, and "signature finding": discovering sets of correlated genes that are relevant for survival prediction

  • CV runs, as well as tables with number of common genes in different runs. The list of these signatures and genes can be sent to our application PaLS [52] to examine PubMed references, Gene Ontology terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways or Reactome pathways that are common to a user-selected percentage of genes and/or signatures
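The tables of common genes across runs can be pictured with a small sketch. This is not SignS code (SignS is an R package and web tool). It is a minimal Python illustration that assumes each run yields a plain list of selected genes. The function names `common_gene_table` and `selection_frequency`, the run labels, and the gene lists are all hypothetical.

```python
from itertools import combinations


def common_gene_table(runs):
    """Count the genes shared by every pair of runs.

    `runs` maps a run label to the list of genes selected in that run.
    The diagonal holds the number of distinct genes in each run.
    """
    names = list(runs)
    table = {a: {b: 0 for b in names} for a in names}
    for a in names:
        table[a][a] = len(set(runs[a]))
    for a, b in combinations(names, 2):
        shared = len(set(runs[a]) & set(runs[b]))
        table[a][b] = table[b][a] = shared
    return table


def selection_frequency(runs):
    """Fraction of runs in which each gene was selected."""
    counts = {}
    for genes in runs.values():
        for gene in set(genes):
            counts[gene] = counts.get(gene, 0) + 1
    n_runs = len(runs)
    return {gene: count / n_runs for gene, count in counts.items()}


# Hypothetical gene lists, for illustration only (not results from SignS).
runs = {
    "full_data": ["TP53", "BRCA1", "MYC", "EGFR"],
    "cv_run_1": ["TP53", "MYC", "KRAS"],
    "cv_run_2": ["TP53", "BRCA1", "MYC"],
}

table = common_gene_table(runs)
freq = selection_frequency(runs)

for name, row in table.items():
    print(name, row)
```

Genes with a selection frequency near 1 recur in almost every run. Low pairwise counts in the table suggest that the selected signature is unstable even when predictive performance is similar.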

Summary

Introduction

Censored data are increasingly common in microarray studies that attempt to relate gene expression to patient survival. In the last two years there has been an increase in the number of new methods proposed for this kind of data [1,2,3,4,5,6,7,8,9,10,11]. Many of these papers have emphasized gene selection and survival prediction, and "signature finding": discovering sets of correlated genes that are relevant for survival prediction. For end-users (e.g., biomedical researchers with microarray data for a sample of patients for whom survival is known), most of these methods are not accessible, which might explain why many papers in the … Tools for end users are badly needed that, while retaining user-friendliness, do not compromise statistical rigor.

