Abstract
Recently, rotation forest has been extended to regression and survival analysis problems. However, because of the intensive computation incurred by principal component analysis, rotation forest often fails when confronted with high-dimensional or big data. In this study, we extend rotation forest to high-dimensional censored time-to-event data analysis by combining random subspace, bagging and rotation forest. Supported by proper statistical analysis, we show that the proposed method, random rotation survival forest, outperforms state-of-the-art survival ensembles such as random survival forest, as well as popular regularized Cox models.
Highlights
Survival analysis of censored data plays a vital role in statistics with abundant applications in various fields such as biostatistics, engineering, finance and economics
Because dimensionality reduction can be achieved by the random subspace method (Ho 1998), which randomly selects a small number of dimensions from a given covariate set when building a base model, we propose a new survival ensemble called random rotation survival forest (RRotSF) for analyzing high-dimensional survival data
In the following datasets, distant metastasis-free survival (DMFS) time values are used as the primary survival endpoint when they are available; otherwise, relapse-free or overall survival time values are used
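
The random subspace and bagging ingredients mentioned in the highlights above can be illustrated with a minimal sketch. It only shows how one base learner might receive a random subset of covariates and a bootstrap sample of rows; it does not implement the PCA rotation or the survival tree, and it is not the authors' code. The function names, the toy matrix and the subspace size of 2 are all hypothetical choices made for illustration.

```python
import random


def draw_subspace_and_bootstrap(n_samples, n_features, subspace_size, rng):
    """Return (feature_idx, row_idx) for one hypothetical base learner.

    feature_idx: covariates sampled without replacement (random subspace).
    row_idx: rows sampled with replacement (bootstrap, as in bagging).
    """
    if not 0 < subspace_size <= n_features:
        raise ValueError("subspace_size must be between 1 and n_features")
    feature_idx = sorted(rng.sample(range(n_features), subspace_size))
    row_idx = [rng.randrange(n_samples) for _ in range(n_samples)]
    return feature_idx, row_idx


def project(X, feature_idx, row_idx):
    """Restrict a list-of-lists matrix to the chosen rows and columns."""
    return [[X[i][j] for j in feature_idx] for i in row_idx]


# Toy data: 5 subjects, 6 covariates (values are arbitrary placeholders).
rng = random.Random(0)
X = [[float(10 * i + j) for j in range(6)] for i in range(5)]

features, rows = draw_subspace_and_bootstrap(len(X), len(X[0]), 2, rng)
X_sub = project(X, features, rows)
```

In a full ensemble, this draw would be repeated once per base learner, and a rotation step followed by a survival tree would be fitted on each reduced matrix. Working on a small covariate subset is what keeps the principal component analysis affordable when the original dimension is large.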
Summary
Survival analysis of censored data plays a vital role in statistics, with abundant applications in fields such as biostatistics, engineering, finance and economics. The past two decades have seen various survival ensembles with parametric and/or non-parametric base models and combining techniques. These techniques include bagging (Hothorn et al 2004, 2006), boosting (Binder and Schumacher 2008; Binder et al 2009; Hothorn and Bühlmann 2006; Li and Luan 2005; Ma and Huang 2007; Ridgeway 1999; Wang and Wang 2010), random survival forest (RSF) (Ishwaran et al 2010, 2011) and the recently proposed rotation survival forest (RotSF) (Zhou et al 2015). Bagging stochastically changes the distribution of the training data by constructing each base survival model on a different bootstrap sample (Hothorn et al 2004). Boosting-based approaches adaptively change the distribution of the training data according to the