Abstract

Individuals may experience repeated events over time. However, there is no consensus on which learning approaches to use for survival data in a high-dimensional framework (when the number of variables exceeds the number of individuals, i.e., p > n). The aim of this study was to identify learning algorithms for analyzing and predicting recurrent events, and to compare them with standard statistical models on simulated data. A systematic literature review was conducted to identify state-of-the-art methodology. Data were then simulated under scenarios varying the number of variables, the proportion of active variables, and the number of events. The performance of the models was assessed using Harrell’s concordance index, Kim’s C-index, and the error rate for active variables. Seven publications were identified: four methodological studies, one application paper, and two reviews. On simulated data, the standard models failed when p > n. Penalized Andersen–Gill and frailty models performed best, whereas RankDeepSurv performed more poorly. As there are currently no guidelines on a specific approach to use, this study deepens understanding of the mechanisms and limits of the investigated methods in this context.
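
As an illustration of one of the evaluation metrics named above, the following is a minimal sketch of Harrell’s concordance index for right-censored, single-event data. It is not the implementation used in the study: the function name `harrell_c_index` and its arguments are hypothetical, tied event times are simply skipped, and Kim’s C-index for recurrent events is not covered.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C for right-censored data (illustrative sketch).

    times       -- observed follow-up time per individual
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # a is the individual with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event has the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1 indicates that predicted risks rank all comparable pairs correctly, 0.5 corresponds to random ranking, and 0 to a fully reversed ranking.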
