Abstract

Survival analysis is a widely used method to establish a connection between a time-to-event outcome and a set of variables. The goal of this work is to improve the accuracy of commonly applied parametric survival models. This work highlights that accurate and interpretable survival analysis models can be identified by clustering-based exploration of the operating regions of local survival models. The key idea is that when the operating regions of local Weibull distributions are represented by Gaussian mixture models, the parameters of the mixture-of-Weibull model can be identified by a clustering algorithm. The proposed method is utilised in three case studies. The examples cover studying the dropout rate of university students, calculating the remaining useful life of lithium-ion batteries, and determining the chances of survival of prostate cancer patients. The results demonstrate the wide applicability of the method and the benefits of clustering-based identification of local Weibull models.

Highlights

  • The goal of traditional survival analysis is to calculate the probability that the amount of time remaining is longer than a certain value [1]

  • The enabling idea of this tendency is that a smart analogy has to be set up between the survival time and the examined problem [3]

  • A trivial analogy comes from the field of degradation analysis, which focuses on calculating the remaining useful life (RUL) of technical systems [4]


Summary

INTRODUCTION

The goal of traditional survival analysis is to calculate the probability that the amount of time remaining is longer than a certain value [1]. It is common practice to identify the domain of interpretability of models, including local ones, by targeted clustering algorithms [8]. We examined how this concept can be used to identify survival analysis models. [...] and Weibull-distributed survival times as well as simultaneously estimate the cumulative distribution function of the local models. The proposed methodology can accurately handle continuous variables because the local methods are represented by fuzzy logic. The method can also deal with discrete variables at the same time.
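
To make the idea of local Weibull models with Gaussian operating regions more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a single continuous covariate x, assumes that the overall survival function is the membership-weighted sum of the local Weibull survival functions, and uses made-up parameter values. The function names (weibull_survival, gaussian_memberships, mixture_survival) are hypothetical, and the clustering-based parameter estimation itself is not shown.

```python
import numpy as np


def weibull_survival(t, shape, scale):
    """Weibull survival function S(t) = P(T > t) = exp(-(t / scale) ** shape)."""
    t = np.asarray(t, dtype=float)
    return np.exp(-(t / scale) ** shape)


def gaussian_memberships(x, weights, means, stds):
    """Normalised membership of a scalar covariate x in each Gaussian region."""
    dens = weights * np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2.0 * np.pi))
    return dens / dens.sum()


def mixture_survival(t, x, weights, means, stds, shapes, scales):
    """Assumed model form: S(t | x) = sum_c p(c | x) * S_c(t)."""
    memberships = gaussian_memberships(x, weights, means, stds)
    local = np.array([weibull_survival(t, k, lam) for k, lam in zip(shapes, scales)])
    return memberships @ local


# Illustrative (made-up) parameters for two operating regions.
weights = np.array([0.5, 0.5])   # mixing weights of the Gaussian components
means = np.array([0.0, 5.0])     # centres of the operating regions
stds = np.array([1.0, 1.0])      # widths of the operating regions
shapes = np.array([1.0, 2.0])    # Weibull shape parameter per region
scales = np.array([10.0, 3.0])   # Weibull scale parameter per region

times = np.array([0.0, 1.0, 5.0])
surv_near_first = mixture_survival(times, 0.0, weights, means, stds, shapes, scales)
surv_near_second = mixture_survival(times, 5.0, weights, means, stds, shapes, scales)
print(surv_near_first)
print(surv_near_second)
```

With these values, a sample at x = 0 belongs almost entirely to the first region, so its survival curve follows the first local Weibull model, while a sample at x = 5 follows the second. The soft memberships are what allow a continuous covariate to blend local models smoothly between regions.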

GAUSSIAN MIXTURE OF SURVIVAL MODELS
ESTIMATION OF THE MODEL PARAMETERS
DETERMINING THE NUMBER OF CLUSTERS BY AKAIKE INFORMATION CRITERION
SUMMARY OF THE PROPOSED METHOD
EXAMINATION OF STUDENT DROPOUT
ESTIMATION OF REMAINING USEFUL LIFE
CONCLUSION