Abstract
Background: Neuroblastoma is the most common tumor of early childhood and is notorious for its high variability in clinical presentation. Accurate prognosis has remained a challenge for many patients. In this study, expression profiles from RNA-sequencing are used to predict survival times directly. Several models are investigated using various annotation levels of expression profiles (genes, transcripts, and introns), and an ensemble predictor is proposed as a heuristic for combining these different profiles.

Results: The use of RNA-seq data is shown to improve accuracy in comparison to using clinical data alone for predicting overall survival times. Furthermore, clinically high-risk patients can be subclassified based on their predicted overall survival times. In this effort, the best performing model was the elastic net using both transcripts and introns together. This model separated patients into two groups with 2-year overall survival rates of 0.40±0.11 (n=22) versus 0.80±0.05 (n=68). The ensemble approach gave similar results, with groups 0.42±0.10 (n=25) versus 0.82±0.05 (n=65). This suggests that the ensemble is able to effectively combine the individual RNA-seq datasets.

Conclusions: Using predicted survival times based on RNA-seq data can provide improved prognosis by subclassifying clinically high-risk neuroblastoma patients.

Reviewers: This article was reviewed by Subharup Guha and Isabel Nepomuceno.
Highlights
Neuroblastoma is the most common tumor of early childhood and is notorious for its high variability in clinical presentation
Since p > n (more predictors than samples), ordinary least squares (OLS) should not be used, as it will overfit the data
Each of the dimension reduction techniques requires the selection of one or more tuning parameters. These parameters are determined by 10-fold cross-validation, which is implemented in R using two packages discussed
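The tuning step described in the highlights can be illustrated with a minimal sketch. It is written in Python rather than R and is not the authors' implementation: the synthetic data, the penalty grid `lambda_grid`, the mixing weight `alpha=0.5`, and all function names are assumptions made for illustration. It fits an elastic net by coordinate descent in a p > n setting and picks the penalty with the lowest 10-fold cross-validated squared error.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def elastic_net(X, y, lam, alpha=0.5, n_iter=50):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||_2^2).
    No intercept is fitted; the data are assumed to be roughly centered."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals for the current beta (all zeros at start)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of predictor j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta

def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[f::k] for f in range(k)]

def cv_error(X, y, lam, folds, alpha=0.5):
    """Mean squared prediction error over all held-out samples."""
    total, count = 0.0, 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        beta = elastic_net([X[i] for i in train], [y[i] for i in train], lam, alpha)
        for i in held_out:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
            count += 1
    return total / count

# Synthetic p > n data (hypothetical): only the first two predictors matter.
rng = random.Random(0)
n, p = 30, 40
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.5) for row in X]

folds = kfold_indices(n, 10, rng)      # 10-fold cross-validation
lambda_grid = [0.01, 0.1, 0.5, 1.0]    # assumed candidate penalties
cv_errors = {lam: cv_error(X, y, lam, folds) for lam in lambda_grid}
best_lam = min(cv_errors, key=cv_errors.get)
print("CV error by penalty:", cv_errors)
print("selected penalty:", best_lam)
```

The same folds are reused for every candidate penalty so that the cross-validated errors are directly comparable. In practice an established package would replace the hand-written solver.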
Summary
Neuroblastoma is the most common tumor of early childhood and is notorious for its high variability in clinical presentation. It is the most frequently diagnosed cancer in the first year of life and the most common extracranial solid tumor in children. It accounts for 5% of all pediatric cancer diagnoses and 10% of all pediatric oncology deaths [1]. These numbers have improved over the past decade, but accurate prognosis for the disease has remained a challenge [1]. In 1984, the MYCN oncogene was identified as a biomarker for clinically aggressive tumors [2]. It has since been one of the most important markers for stratifying patients. While aberrations of these genes indicate an increased susceptibility to the disease, these markers are less useful for stratifying patients into risk groups after diagnosis.