Dynamic prediction with penalized joint frailty model of high-dimensional recurrent event data and a survival outcome


Similar Papers
  • Research Article
  • Citations: 5
  • 10.1080/14017431.2019.1645349
First and recurrent events after percutaneous coronary intervention: implications for survival analyses
  • Jul 25, 2019
  • Scandinavian Cardiovascular Journal
  • Anupama Vasudevan + 12 more

Objectives. Using composite endpoints and/or only first events in clinical research results in information loss; alternative statistical methods that incorporate recurrent event data exist. We compared information loss under traditional analyses to alternative models. Design. We conducted a retrospective analysis of patients who underwent percutaneous coronary intervention (Jan 2010–Dec 2014) and constructed Cox models for a composite endpoint (readmission/death), a shared frailty model for recurrent events, and a joint frailty (JF) model to simultaneously account for recurrent and terminal events; we evaluated the impact of heart failure (HF) on the outcome. Results. Among 4901 patients, 2047 (41.8%) experienced a readmission or death within 1 year. Of those with recurrent events, 60% had ≥1 readmission and 6% had >4; a total of 121 (2.5%) patients died during follow-up. The presence of HF conferred an adjusted hazard ratio (HR) of 1.32 (95% CI: 1.18–1.47, p < .001) for the risk of the composite endpoint (Cox model), 1.44 (95% CI: 1.36–1.52, p < .001) in the frailty model, and 1.34 (95% CI: 1.22–1.46, p < .001) in the JF model. However, HF was not associated with death (HR 0.87, 95% CI: 0.52–1.48, p = .61) in the JF model. Conclusions. Using a composite endpoint and/or only the first event yields substantial loss of information, as many individuals endure >1 event. JF models reduce bias by simultaneously providing event-specific HRs for recurrent and terminal events.
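
The JF structure compared above can be sketched as follows. This is one common parameterization (a shared gamma frailty with a power link α), not necessarily the exact specification used in the study:

```latex
% Joint frailty model sketch: a shared frailty u_i links the two processes;
% alpha controls how strongly shared recurrent-event risk carries over to death.
\begin{aligned}
r_{ij}(t \mid u_i) &= u_i \, r_0(t)\, \exp(\beta^\top X_i) && \text{(recurrent events)} \\
h_i(t \mid u_i)    &= u_i^{\alpha} \, h_0(t)\, \exp(\gamma^\top X_i) && \text{(terminal event)} \\
u_i &\sim \mathrm{Gamma}(1/\theta,\, 1/\theta), \quad \mathbb{E}[u_i] = 1, \quad \operatorname{Var}(u_i) = \theta
\end{aligned}
```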

  • Research Article
  • Citations: 2
  • 10.1002/sim.9235
Response‐adaptive treatment allocation for clinical studies with recurrent event and terminal event data
  • Oct 24, 2021
  • Statistics in Medicine
  • Pei‐Fang Su

In long-term clinical studies, recurrent event data are frequently collected to contrast the efficacy of two different treatments. However, the recurrent event process can be stopped by a terminal event, such as death. For analyzing recurrent and terminal event data, joint frailty modeling has recently received considerable attention because it makes it possible to study the joint evolution over time of both processes and yields consistent and efficient parameter estimates. For two-arm clinical trial designs based on such data, there has been limited research even on balanced designs, let alone adaptive treatment allocation. Although equal sample size allocation between treatments is the intuitive default in trial design, if one treatment is expected to be superior, it may be desirable to allocate more subjects to the effective treatment. In this article, we calculate the required sample size based on restricted randomization and then propose a target response-adaptive randomization procedure for recurrent and terminal event outcomes based on the joint frailty model. A randomization procedure, the doubly adaptive biased coin design, which targets an optimal allocation, is implemented. The proposed adaptive treatment allocation schemes can reduce the number of trial participants who receive the inferior treatment while reaching an optimal target and retaining test power comparable to a restricted randomization design. Finally, two clinical studies, the COAPT trial and the A-HeFT trial, are used to illustrate the advantages of adopting the proposed procedure.
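
To make the doubly adaptive biased coin design concrete, here is a minimal sketch of one common allocation-probability form (the Hu–Zhang family). In the paper's setting the target allocation would be re-estimated from interim joint frailty model fits; here `target_rho` and the tuning parameter `gamma` are assumptions, not the authors' exact choices:

```python
import numpy as np

def dbcd_prob(n1, n, target_rho, gamma=2.0):
    """Probability of assigning the next subject to treatment 1 under a
    doubly adaptive biased coin design (Hu-Zhang form; a sketch), given
    that n1 of the first n subjects received treatment 1 and the current
    estimated optimal allocation proportion is target_rho."""
    if n == 0:
        return target_rho
    x = np.clip(n1 / n, 1e-6, 1 - 1e-6)   # current allocation proportion
    num = target_rho * (target_rho / x) ** gamma
    den = num + (1 - target_rho) * ((1 - target_rho) / (1 - x)) ** gamma
    return num / den

# e.g., 12 of 30 subjects on treatment 1 so far, estimated optimum 0.6:
# dbcd_prob(12, 30, 0.6) -> ~0.88, nudging allocation toward the target
```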

  • Research Article
  • Citations: 2
  • 10.1186/s12874-024-02418-9
Penalized landmark supermodels (penLM) for dynamic prediction for time-to-event outcomes in high-dimensional data
  • Jan 27, 2025
  • BMC Medical Research Methodology
  • Anya H Fries + 2 more

Background: To effectively monitor long-term outcomes among cancer patients, it is critical to accurately assess patients' dynamic prognosis, which often involves utilizing multiple data sources (e.g., tumor registries, treatment histories, and patient-reported outcomes). However, challenges arise in selecting features to predict patient outcomes from high-dimensional data, aligning longitudinal measurements from multiple sources, and evaluating dynamic model performance. Methods: We provide a framework for dynamic risk prediction using the penalized landmark supermodel (penLM) and develop novel summary metrics to evaluate and summarize model performance across different timepoints. Through simulations, we assess the coverage of the proposed metrics' confidence intervals under various scenarios. We applied penLM to predict the updated 5-year risk of lung cancer mortality at diagnosis and for subsequent years by combining data from SEER registries (2007–2018), Medicare claims (2007–2018), the Medicare Health Outcome Survey (2006–2018), and the U.S. Census (1990–2010). Results: The simulations confirmed valid coverage (~95%) of the confidence intervals of the proposed summary metrics. Of 4,670 lung cancer patients, 41.5% died from lung cancer. Using penLM, the key features to predict lung cancer mortality included long-term lung cancer treatments, minority races, regions with low educational attainment or racial segregation, and various patient-reported outcomes beyond cancer staging and tumor characteristics. When evaluated using the proposed metrics, the penLM model developed using multi-source data (summary metric 0.77 [95% confidence interval: 0.74–0.79]) outperformed those developed using single-source data (range: 0.50–0.74). Conclusions: The proposed penLM framework with novel evaluation metrics offers effective dynamic risk prediction when leveraging high-dimensional multi-source longitudinal data.
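
To make the landmark-supermodel idea concrete, here is a minimal sketch, not the authors' penLM software: landmark datasets are stacked, the clock is reset at each landmark, and a lasso-penalized Cox model is fit to the stack. lifelines is used for the penalized fit; the columns `time`, `event`, and `id` are assumptions:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

def stack_landmarks(df, landmarks, horizon):
    """Build a stacked landmark dataset: at each landmark s, keep subjects
    still at risk, censor administratively at s + horizon, and reset the
    clock to the landmark. Columns 'time' and 'event' are assumed."""
    parts = []
    for s in landmarks:
        at_risk = df[df["time"] > s].copy()
        at_risk["event"] = np.where(at_risk["time"] <= s + horizon,
                                    at_risk["event"], 0)
        at_risk["time"] = np.minimum(at_risk["time"], s + horizon) - s
        at_risk["landmark"] = s
        parts.append(at_risk)
    return pd.concat(parts, ignore_index=True)

# stacked = stack_landmarks(df, landmarks=[0, 1, 2, 3], horizon=5)
# Lasso-penalized Cox supermodel, stratified on the landmark; subjects
# repeat across landmarks, so clustered (robust) standard errors are needed:
# cph = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)
# cph.fit(stacked, duration_col="time", event_col="event",
#         strata=["landmark"], cluster_col="id")
```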

  • Research Article
  • 10.2196/86327
Deep Learning for Dynamic Prognostic Prediction in Minimally Invasive Surgery for Intracerebral Hemorrhage: Model Development and Validation Study.
  • Jan 7, 2026
  • JMIR medical informatics
  • Jingxuan Wang + 8 more

The pathological and physiological state of patients with intracerebral hemorrhage (ICH) after minimally invasive surgery (MIS) evolves dynamically, and traditional models cannot dynamically predict prognosis. Clinical data collected at multiple time points often differ in the categories and numbers of variables and contain missing values, and existing models lack methods to handle such imbalanced data. This study aims to develop and validate a dynamic prognostic model using multi-time-point data from patients with ICH undergoing MIS to predict survival and functional outcomes. In this study, data from 287 patients who underwent MIS for ICH were retrospectively collected on the day of surgery; on days 1, 3, 7, and 14 after surgery; and on the day of drainage tube removal. General information, vital signs, laboratory test findings, neurological function scores, head hematoma volume, and MIS-related indicators were collected. In addition, this study proposes a multistep attention model, the MultiStep Transformer, which simultaneously outputs three prediction probabilities: 30-day survival, 180-day survival, and 180-day favorable functional outcome (modified Rankin Scale [mRS] 0-3). Five-fold cross-validation was used to evaluate the performance of the model and compare it with mainstream models and traditional scores. The main evaluation metrics included accuracy, precision, recall, and F1-score. The predictive performance of the model was evaluated using receiver operating characteristic (ROC) curves; its calibration was assessed via calibration curves; and its clinical utility was examined using decision curve analysis (DCA). Attributable value analysis was conducted to assess the key predictive features. The 30-day survival rate, 180-day survival rate, and 180-day favorable functional outcome rate among the 287 patients were 92.3%, 88.8%, and 52.3%, respectively. In predicting survival and functional outcomes, the MultiStep Transformer model showed remarkable superiority over traditional scoring systems and other deep learning models. For the three outcomes, the model achieved areas under the receiver operating characteristic curve (AUROCs) of 0.87 (95% CI 0.82-0.92), 0.85 (95% CI 0.77-0.93), and 0.75 (95% CI 0.72-0.78), with corresponding Brier scores of 0.1041, 0.1115, and 0.231. DCA confirmed that the model provided a definite clinical net benefit for threshold probabilities within 0.06-0.26, 0.04-0.5, and 0.21-0.71. The MultiStep Transformer model proposed in this study can effectively use imbalanced data to build a model. It has good dynamic prediction ability for short-term and long-term survival and functional outcomes of patients with ICH undergoing MIS, providing a novel tool for individualized prognostic assessment in this population.
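
The architecture below is a generic stand-in, not the authors' MultiStep Transformer: a small PyTorch transformer encoder over per-timepoint feature vectors, with three sigmoid heads matching the three predicted outcomes. All layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ThreeHeadEncoder(nn.Module):
    """Generic sketch: encode a sequence of per-timepoint feature vectors
    and emit three probabilities (30-day survival, 180-day survival,
    180-day mRS 0-3). Not the authors' MultiStep Transformer."""
    def __init__(self, n_features, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.heads = nn.ModuleList([nn.Linear(d_model, 1) for _ in range(3)])

    def forward(self, x, pad_mask=None):     # x: (batch, timepoints, features)
        h = self.encoder(self.embed(x), src_key_padding_mask=pad_mask)
        pooled = h.mean(dim=1)               # average over timepoints
        return torch.cat([torch.sigmoid(head(pooled)) for head in self.heads],
                         dim=1)              # (batch, 3) probabilities

# probs = ThreeHeadEncoder(n_features=30)(torch.randn(8, 6, 30))  # -> (8, 3)
```

Missing timepoints would be handled via `pad_mask`; how the original model treats imbalanced and missing data is specific to the paper.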

  • Research Article
  • Citations: 4
  • 10.1080/24709360.2019.1693198
Dynamic prediction using joint models of longitudinal and recurrent event data: a Bayesian perspective
  • Nov 22, 2019
  • Biostatistics & Epidemiology
  • Xuehan Ren + 2 more

In cardiovascular disease (CVD) studies, the events of interest may be recurrent (multiple occurrences from the same individual). During study follow-up, longitudinal measurements are often available, and these measurements are highly predictive of event recurrences. It is of great clinical interest to make personalized predictions of the next occurrence of recurrent events using the available clinical information, because this enables clinicians to make more informed and personalized decisions and recommendations. To this end, we propose a joint model of longitudinal and recurrent event data. We develop a Bayesian approach for model inference and a dynamic prediction framework for predicting target subjects' future outcome trajectories and risk of the next recurrent event, based on their data up to the prediction time point. To improve computational efficiency, an embarrassingly parallel MCMC (EP-MCMC) method is utilized. It partitions the data into multiple subsets, runs an MCMC sampler on each subset, and applies random partition trees to combine the posterior draws from all subsets. Our method development is motivated by and applied to the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT), one of the largest CVD studies to compare the effectiveness of medications to treat hypertension.
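
Schematically, a shared-random-effects joint model of this kind couples the marker trajectory to the recurrent-event intensity. The block below is a generic formulation; the paper's Bayesian specification may differ:

```latex
% Longitudinal submodel: observed marker = smooth subject trajectory + noise
Y_i(t) = m_i(t; b_i) + \varepsilon_i(t), \qquad b_i \sim \mathcal{N}(0, \Sigma)
% Recurrent-event submodel: intensity depends on the current trajectory value,
% with alpha measuring the longitudinal-recurrent association
r_i(t \mid b_i) = r_0(t)\, \exp\{\beta^\top X_i + \alpha\, m_i(t; b_i)\}
```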

  • Research Article
  • 10.18502/jbe.v8i3.12306
Joint Frailty Model of Recurrent and Terminal Events in the Presence of Cure Fraction using a Bayesian Approach
  • Mar 17, 2023
  • Journal of Biostatistics and Epidemiology
  • Zahra Arab Borzu + 5 more

Introduction: Recurrent event data are common in many longitudinal studies. Often, a terminating event such as death is correlated with the recurrent event process. A shared frailty model is applied to account for the association between recurrent and terminal events. In some situations, a fraction of subjects experience neither recurrent events nor death; these subjects are cured. Methods: In this paper, we discuss a Bayesian approach to a joint frailty model for recurrent and terminal events in the presence of a cure fraction. We compared parameter estimates under the frequentist and Bayesian approaches via simulation studies with various sample sizes, and we applied the joint frailty model with a cure fraction, under both approaches, to breast cancer data. Results: In small samples, the Bayesian approach had smaller standard errors and mean squared errors than the frequentist approach, with coverage probabilities close to the nominal 95% level. Also, under the Bayesian approach, the sampling means of the estimated standard errors were close to the empirical standard errors. Conclusion: The simulation results suggest that when the sample size is small, the Bayesian joint frailty model with a cure fraction leads to more efficient parameter estimation and statistical inference.
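
A mixture-cure formulation of the kind the abstract alludes to can be sketched as follows; this is schematic, and the paper's exact cure-fraction joint frailty specification may differ:

```latex
% Population survival mixes a cured fraction pi(x) with uncured survival S_u
S_{\mathrm{pop}}(t \mid x) = \pi(x) + \{1 - \pi(x)\}\, S_u(t \mid x),
\qquad \pi(x) = \frac{e^{b^\top x}}{1 + e^{b^\top x}}
% Among the uncured, recurrent and terminal events share a frailty u_i,
% as in a standard joint frailty model.
```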

  • Research Article
  • Citations: 13
  • 10.1038/s41598-020-66466-z
Bayesian Hyper-LASSO Classification for Feature Selection with Application to Endometrial Cancer RNA-seq Data
  • Jun 16, 2020
  • Scientific Reports
  • Lai Jiang + 3 more

Feature selection is demanded in many modern scientific research problems that use high-dimensional data. A typical example is to identify gene signatures related to a certain disease from high-dimensional gene expression data. The expression of genes may have grouping structure; for example, a group of co-regulated genes with similar biological functions tend to have similar expression. Thus it is preferable to take the grouping structure into consideration when selecting features. In this paper, we propose a Bayesian Robit regression method with Hyper-LASSO priors (abbreviated BayesHL) for feature selection in high-dimensional genomic data with grouping structure. BayesHL discards unrelated features more aggressively than LASSO, and it performs feature selection within groups automatically, without a pre-specified grouping structure. We apply BayesHL in gene expression analysis to identify subsets of genes that contribute to the 5-year survival outcome of endometrial cancer (EC) patients. Results show that BayesHL outperforms alternative methods (including LASSO, group LASSO, supervised group LASSO, penalized logistic regression, random forest, neural network, XGBoost and knockoffs) in terms of predictive power, sparsity and the ability to uncover grouping structure, and provides insight into the mechanisms of multiple genetic pathways leading to differentiated EC survival outcomes.
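
Schematically, Robit regression replaces the logistic/probit link with a heavy-tailed t CDF, and the hyper-LASSO priors shrink coefficients hierarchically. The block below is a sketch under assumed notation, not the paper's exact hyperprior:

```latex
% Robit link: F is the CDF of a t distribution with nu degrees of freedom,
% heavier-tailed (more outlier-robust) than the probit link
\Pr(y_i = 1 \mid x_i) = F_{t_\nu}\!\big(\beta_0 + x_i^\top \beta\big)
% Hierarchical shrinkage on each coefficient; the heavy-tailed hyperprior
% on lambda_j gives the "hyper-LASSO" behaviour (aggressive selection)
\beta_j \mid \lambda_j \sim \mathcal{N}(0, \lambda_j), \qquad
\lambda_j \sim \text{heavy-tailed hyperprior}
```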

  • Research Article
  • Citations: 63
  • 10.1109/tcbb.2012.63
Gene Selection Using Iterative Feature Elimination Random Forests for Survival Outcomes
  • Sep 1, 2012
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • Herbert Pang + 3 more

Although many feature selection methods for classification have been developed, there is a need to identify genes in high-dimensional data with censored survival outcomes. Traditional methods for gene selection in classification problems have several drawbacks. First, the majority of gene selection approaches for classification are single-gene based. Second, many of the gene selection procedures are not embedded within the algorithm itself. The technique of random forests has been found to perform well in high-dimensional data settings with survival outcomes. It also has an embedded feature to identify variables of importance. Therefore, it is an ideal candidate for gene selection in high-dimensional data with survival outcomes. In this paper, we develop a novel method based on random forests to identify a set of prognostic genes. We compare our method with several machine learning methods and various node split criteria using several real data sets. Our method performed well in both simulations and real data analysis. Additionally, we have shown the advantages of our approach over single-gene-based approaches. Our method incorporates multivariate correlations in microarray data for survival outcomes. The described method allows us to better utilize the information available from microarray data with survival outcomes.
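
The elimination loop can be sketched as follows. This mirrors the spirit of iterative feature elimination with random survival forests (here using permutation importance on the concordance index via scikit-survival and scikit-learn), not the authors' exact importance measure or split criteria:

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sksurv.ensemble import RandomSurvivalForest

def iterative_rsf_selection(X, y, drop_frac=0.2, min_features=10):
    """Repeatedly fit an RSF and drop the least important features until
    only min_features remain. X: DataFrame of gene expressions;
    y: sksurv structured array of (event, time). A sketch."""
    feats = list(X.columns)
    while len(feats) > min_features:
        rsf = RandomSurvivalForest(n_estimators=200, random_state=0, n_jobs=-1)
        rsf.fit(X[feats], y)
        imp = permutation_importance(rsf, X[feats], y,
                                     n_repeats=5, random_state=0)
        order = np.argsort(imp.importances_mean)          # ascending importance
        n_drop = min(max(1, int(drop_frac * len(feats))),
                     len(feats) - min_features)
        feats = [feats[i] for i in sorted(order[n_drop:])]  # keep the rest
    return feats
```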

  • Research Article
  • Citations: 13
  • 10.1016/j.cmpb.2019.105259
Computational issues in fitting joint frailty models for recurrent events with an associated terminal event
  • Dec 2, 2019
  • Computer Methods and Programs in Biomedicine
  • Gerrit Toenges + 1 more


  • Research Article
  • Citations: 62
  • 10.1186/s12874-021-01375-x
Random survival forests for dynamic predictions of a time-to-event outcome using a longitudinal biomarker
  • Oct 17, 2021
  • BMC Medical Research Methodology
  • Kaci L Pickett + 4 more

Background: Risk prediction models for time-to-event outcomes play a vital role in personalized decision-making. A patient's biomarker values, such as medical lab results, are often measured over time, but traditional prediction models ignore their longitudinal nature, using only baseline information. Dynamic prediction incorporates longitudinal information to produce updated survival predictions during follow-up. Existing methods for dynamic prediction include joint modeling, which often suffers from computational complexity and poor performance under misspecification, and landmarking, which has a straightforward implementation but typically relies on a proportional hazards model. Random survival forests (RSF), a machine learning algorithm for time-to-event outcomes, can capture complex relationships between the predictors and survival without requiring prior specification and has been shown to have superior predictive performance. Methods: We propose an alternative approach for dynamic prediction using random survival forests in a landmarking framework. With a simulation study, we compared the predictive performance of our proposed method with Cox landmarking and joint modeling in situations where the proportional hazards assumption does not hold and the longitudinal marker(s) have a complex relationship with the survival outcome. We illustrated the use of the RSF landmark approach in two clinical applications to assess the performance of various RSF model-building decisions and to demonstrate its use in obtaining dynamic predictions. Results: In simulation studies, RSF landmarking outperformed joint modeling and Cox landmarking when a complex relationship between the survival and longitudinal marker processes was present. It was also useful in application when there were several predictors whose clinical relevance was unknown and multiple longitudinal biomarkers were present. Individualized dynamic predictions can be obtained from this method, and the variable importance metric is useful for examining the changing predictive power of variables over time. In addition, RSF landmarking is easily implementable in standard software and, using the suggested specifications, requires less computation time than joint modeling. Conclusions: RSF landmarking is a nonparametric, machine learning alternative to current methods for obtaining dynamic predictions when complex or unknown relationships are present. It requires little upfront decision-making, has comparable predictive performance, and has preferable computational speed.
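
A minimal sketch of RSF landmarking follows; the column names and the last-observation-carried-forward (LOCF) choice are assumptions, not the authors' exact pipeline:

```python
import numpy as np
import pandas as pd
from sksurv.ensemble import RandomSurvivalForest
from sksurv.util import Surv

def rsf_at_landmark(base, long, s, horizon):
    """Fit an RSF on the landmark dataset at time s (a sketch).
    base: one row per subject with 'id', 'time', 'event', baseline covariates;
    long: long-format biomarker rows with 'id', 'obs_time', 'marker'."""
    risk = base[base["time"] > s].copy()          # subjects still at risk at s
    # last biomarker value observed at or before the landmark (LOCF)
    locf = (long[long["obs_time"] <= s]
            .sort_values("obs_time")
            .groupby("id")["marker"].last())
    risk["marker"] = risk["id"].map(locf)
    risk = risk.dropna(subset=["marker"])         # RSF cannot take missing values
    # administrative censoring at s + horizon, clock reset to the landmark
    risk["event"] = np.where(risk["time"] <= s + horizon, risk["event"], 0)
    risk["time"] = np.minimum(risk["time"], s + horizon) - s
    X = risk.drop(columns=["id", "time", "event"])
    y = Surv.from_arrays(event=risk["event"].astype(bool), time=risk["time"])
    return RandomSurvivalForest(n_estimators=300, random_state=0).fit(X, y)
```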

  • Research Article
  • 10.6000/1929-6029.2023.12.25
Joint Frailty Mixing Model for Recurrent Event Data with an Associated Terminal Event: Application to Hospital Readmission Data
  • Nov 24, 2023
  • International Journal of Statistics in Medical Research
  • Goutam Barman + 3 more

Recurrent events such as repeated hospitalizations, cancer tumour recurrences, and many others occur frequently. Follow-up on recurrent events may be stopped by a terminal event such as death, and when recurrent events are frequent they may increase the risk of the terminal event, making the terminal event 'dependent'. In this article, we study joint modelling and analysis of recurrent events with a dependent terminal event. A proportional intensity model is used for the recurrent events process and a proportional hazards model for the terminal event time. To account for the association between recurrent and terminal events, a mixing frailty (random effect) is studied rather than the usual pure frailty: the frailty distribution is taken to be a mixture of a folded normal distribution and a gamma distribution rather than a pure gamma distribution. An estimation procedure for the joint frailty model is applied to estimate the model parameters; the method is close to the method of minimum chi-square rather than a more complicated one. An extensive simulation study has been performed to estimate the model parameters, with performance evaluated by bias and MSE criteria. From an application point of view, the method is illustrated on hospital readmission data for colorectal cancer patients.
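
For illustration, frailties from such a mixture can be drawn as below; the mixture weight and all distribution parameters are illustrative assumptions, not values from the paper:

```python
import numpy as np

def sample_mixing_frailty(n, p=0.5, sigma=1.0, shape=2.0, rate=2.0, seed=0):
    """Draw n frailties from a mixture of a folded normal and a gamma
    distribution, in the spirit of the mixing-frailty idea above."""
    rng = np.random.default_rng(seed)
    from_folded = rng.random(n) < p                 # component indicator
    folded = np.abs(rng.normal(0.0, sigma, n))      # folded normal draws
    gamma = rng.gamma(shape, 1.0 / rate, n)         # gamma draws
    return np.where(from_folded, folded, gamma)

# u = sample_mixing_frailty(10_000); multiplying baseline intensities by u
# induces within-subject correlation between recurrent and terminal events
```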

  • Research Article
  • Citations: 3
  • 10.1002/sim.6784
Covariate dimension reduction for survival data via the Gaussian process latent variable model.
  • Nov 3, 2015
  • Statistics in Medicine
  • James E Barrett + 1 more

The analysis of high-dimensional survival data is challenging, primarily owing to the problem of overfitting, which occurs when spurious relationships are inferred from data that subsequently fail to exist in test data. Here, we propose a novel method of extracting a low-dimensional representation of covariates in survival data by combining the popular Gaussian process latent variable model with a Weibull proportional hazards model. The combined model offers a flexible non-linear probabilistic method of detecting and extracting any intrinsic low-dimensional structure from high-dimensional data. By reducing the covariate dimension, we aim to diminish the risk of overfitting and increase the robustness and accuracy with which we infer relationships between covariates and survival outcomes. In addition, we can simultaneously combine information from multiple data sources by expressing multiple datasets in terms of the same low-dimensional space. We present results from several simulation studies that illustrate a reduction in overfitting and an increase in predictive performance, as well as successful detection of intrinsic dimensionality. We provide evidence that it is advantageous to combine dimensionality reduction with survival outcomes rather than performing unsupervised dimensionality reduction on its own. Finally, we use our model to analyse experimental gene expression data and detect and extract a low-dimensional representation that allows us to distinguish high-risk and low-risk groups with superior accuracy compared with doing regression on the original high-dimensional data.
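
The combined model can be sketched as follows, under assumed notation (p observed covariates, latent dimension q):

```latex
% GPLVM part: high-dimensional covariates x_i are generated from a
% low-dimensional latent position z_i via a GP-distributed mapping f
x_i = f(z_i) + \epsilon_i, \qquad f \sim \mathcal{GP}(0, k),
\qquad z_i \in \mathbb{R}^q, \quad q \ll p
% Survival part: Weibull proportional hazards on the latent coordinates
h(t \mid z_i) = \lambda\, \nu\, t^{\nu - 1} \exp(\beta^\top z_i)
```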

  • Research Article
  • Citations: 6
  • 10.1007/s00184-016-0577-9
A new joint model of recurrent event data with the additive hazards model for the terminal event time
  • Apr 1, 2016
  • Metrika
  • Xiaoyu Che + 1 more

Recurrent event data are frequently encountered in clinical and observational studies related to biomedical science, econometrics, reliability and demography. In some situations, recurrent events serve as important indicators for evaluating disease progression, health deterioration, or insurance risk. In the statistical literature, noninformative censoring is typically assumed when statistical methods and theories are developed for analyzing recurrent event data. In many applications, however, there may exist a terminal event, such as death, that stops the follow-up, and it is the correlation of this terminal event with the recurrent event process that is of interest. This work considers joint modeling and analysis of recurrent event and terminal event data, with the focus primarily on determining how the terminal event process and the recurrent event process are correlated (i.e., whether the frequency of recurrent events influences the risk of the terminal event). We propose a joint model of the recurrent event process and the terminal event, linked through a common subject-specific latent variable, in which the proportional intensity model is used for modeling the recurrent event process and the additive hazards model is used for modeling the terminal event time.
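
One plausible way to write the link the abstract describes, with the shared latent variable acting multiplicatively on the recurrent-event intensity and additively on the terminal hazard (a sketch, not necessarily the paper's exact formulation):

```latex
\begin{aligned}
\lambda_i(t \mid u_i) &= u_i\, \lambda_0(t)\, \exp(\beta^\top X_i) && \text{(proportional intensity)} \\
h_i(t \mid u_i)       &= h_0(t) + \gamma^\top X_i + \alpha\, u_i   && \text{(additive hazards)}
\end{aligned}
```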

  • Research Article
  • Citations: 48
  • 10.1002/sim.5980
Dynamic prediction of risk of death using history of cancer recurrences in joint frailty models
  • Sep 13, 2013
  • Statistics in Medicine
  • Audrey Mauguen + 5 more

Evaluating the prognosis of patients according to their demographic, biological, or disease characteristics is a major issue, as it may be used for guiding treatment decisions. In cancer studies, typically, more than one endpoint can be observed before death. Patients may undergo several types of events, such as local recurrences and distant metastases, with death as the terminal event. The accuracy of clinical decisions may be improved when the history of these different events is considered. Thus, it may be useful to dynamically predict patients' risk of death using their recurrence history. As previously applied within the framework of joint models for longitudinal and time-to-event data, we propose a dynamic prediction tool based on joint frailty models. Joint modeling accounts for the dependence between recurrent events and death by introducing a random effect shared by the two processes. We estimate the probability of death between the prediction time t and a horizon t + w, conditional on the information available at time t. The prediction can be updated with the occurrence of a new event. We propose and compare three prediction settings, taking into account three different information levels. The proposed tools are applied to patients diagnosed with primary invasive breast cancer and treated with breast-conserving surgery, followed for more than 10 years in a French comprehensive cancer center.
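
Schematically, the predicted probability of death in the window (t, t+w], conditional on the history H(t), integrates the frailty over its posterior given that history; the paper's three settings differ in how much information enters H(t). A sketch of the form:

```latex
% S_D: terminal-event survival given frailty u; p(u | H(t)): frailty
% posterior given covariates and the recurrent-event history up to t
P(t, t+w) = \Pr\{t < T_D \le t + w \mid T_D > t, \mathcal{H}(t)\}
= \frac{\int \{ S_D(t \mid u) - S_D(t+w \mid u) \}\, p(u \mid \mathcal{H}(t))\, du}
       {\int S_D(t \mid u)\, p(u \mid \mathcal{H}(t))\, du}
```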

  • Research Article
  • Citations: 1
  • 10.1007/s40840-022-01300-5
Comparison of Joint Modelling and Landmarking Approaches for Dynamic Prediction Using Bootstrap Simulation
  • May 26, 2022
  • Bulletin of the Malaysian Mathematical Sciences Society
  • Zakir Hossain + 1 more

Prediction models for clinical outcomes can greatly help clinicians with early diagnosis, cost-effective management and primary prevention of many medical conditions. In conventional prediction models, predictors are typically measured at a fixed time point, either at baseline or at another time point of interest, such as biomarker values measured at the most recent follow-up. Dynamic prediction has emerged as a more appealing prediction technique that takes account of the longitudinal history of biomarkers when making predictions. We compared the prediction performance of two well-known approaches for dynamic prediction, namely joint modelling and landmarking, using bootstrap simulation based on the Alzheimer's Disease Neuroimaging Initiative (ADNI) data, with repeated Mini-Mental State Examination (MMSE) scores as the longitudinal biomarker and time to Alzheimer's disease (AD) as the survival outcome. We assessed the performance of both approaches in terms of extended definitions of discrimination and calibration, namely the dynamic area under the receiver operating characteristic curve (dynAUC) and the expected prediction error (PE). We focused on real-data-based bootstrap simulation in an attempt to be as impartial as possible to both methods, as landmarking is a pragmatic approach that does not specify a statistical model for the longitudinal markers, and therefore any comparison based on model-based data simulation may potentially be more advantageous to the joint modelling approach. The dynAUC and PE were compared at landmarks t_s = 1.0, 1.5, 2.0, and 2.5 years and within a 2-year window from the landmark time points. The optimism-corrected estimates of dynAUC for joint modelling were slightly higher (by 1.26, 3.22, 2.76 and 0.12% at the four landmark time points) than those of the landmarking approach. Apart from the final landmark point (at 2.5 years), dynamic prediction based on joint models also performed slightly better in terms of calibration. The expected prediction errors for joint models were 0.70, 2.56 and 2.04% lower at the first three landmark time points, respectively, compared with the landmarking approach. In general, the joint modelling approach performed better than the landmarking approach in terms of both discrimination (dynAUC) and calibration (PE), although the margin of gain from using joint models over landmarking was relatively small, indicating that the landmarking approach came close despite not having a precise statistical model characterising the evolution of the longitudinal markers. Future comparative studies should consider extended versions of the joint modelling and landmarking approaches, which may overcome some of the limitations of the standard methods.
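
As a toy illustration of window-based dynamic discrimination of the kind reported above, scikit-survival's cumulative_dynamic_auc can be evaluated on the risk set at a landmark; the simulated data, landmark, and window below are entirely made up, and a real analysis would use held-out data rather than scoring the training set:

```python
import numpy as np
from sksurv.util import Surv
from sksurv.metrics import cumulative_dynamic_auc

rng = np.random.default_rng(1)
n = 500
x = rng.normal(0.0, 1.0, n)                  # toy prognostic score
t_event = rng.exponential(np.exp(-x))        # higher x -> shorter survival
t_cens = rng.exponential(3.0, n)             # independent censoring
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens
s, w = 1.0, 2.0                              # landmark and window (made up)
at_risk = time > s                           # landmark risk set
y = Surv.from_arrays(event=event[at_risk], time=time[at_risk])
grid = np.linspace(s + 0.1, s + w, 10)       # evaluation times in the window
auc, mean_auc = cumulative_dynamic_auc(y, y, x[at_risk], grid)
print(f"dynAUC over ({s}, {s + w}]: {mean_auc:.3f}")
```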
