An online framework for survival analysis: reframing Cox proportional hazards model for large data sets and neural networks.

Aliasghar Tarkhan, Noah Simon

doi:10.1093/biostatistics/kxac039

Abstract

In many biomedical applications, the outcome is measured as a "time-to-event" (e.g., disease progression or death). To assess the connection between a patient's features and this outcome, it is common to assume a proportional hazards model and fit a proportional hazards regression (or Cox regression). To fit this model, a log-concave objective function known as the "partial likelihood" is maximized. For moderate-sized data sets, an efficient Newton-Raphson algorithm that leverages the structure of the objective function can be employed. However, in large data sets this approach has two issues: (i) the computational tricks that leverage structure can also lead to computational instability; (ii) the objective function does not naturally decouple, so if the data set does not fit in memory, the model can be computationally expensive to fit. This also means that the objective is not directly amenable to stochastic gradient-based optimization methods. To overcome these issues, we propose a simple, new framing of proportional hazards regression that results in an objective function amenable to stochastic gradient descent. We show that this simple modification allows us to efficiently fit survival models with very large data sets. It also facilitates training complex (for example, neural-network-based) models with survival data.
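The abstract does not spell out the reframed objective, so the following is only a minimal sketch of the general idea under a stated assumption: that the Cox partial likelihood is evaluated within small random batches, so each stochastic gradient step touches only one batch rather than the full data set. The function names (`batch_loss_and_grad`, `fit_sgd`), the simulated data, and all hyperparameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def batch_loss_and_grad(beta, X, time, event):
    """Negative log partial likelihood of one batch, and its gradient.

    Assumption: risk sets are formed inside the batch only, which is what
    lets the objective decouple across batches. Tied times are handled in
    the simple Breslow-style way (everyone with time >= t is at risk).
    """
    eta = X @ beta
    eta = eta - eta.max()  # shift for stability; the loss is invariant to it
    w = np.exp(eta)
    # Row i marks the subjects still at risk at time[i].
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    denom = at_risk @ w
    loss = np.sum(event * (np.log(denom) - eta))
    # Risk-set weighted mean of the features, one row per subject.
    xbar = (at_risk @ (w[:, None] * X)) / denom[:, None]
    grad = np.sum(event[:, None] * (xbar - X), axis=0)
    n = len(time)
    return loss / n, grad / n


def fit_sgd(X, time, event, batch_size=32, lr=0.1, epochs=30, seed=0):
    """Plain mini-batch SGD on the batch-wise partial likelihood."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n - batch_size + 1, batch_size):
            idx = perm[start:start + batch_size]
            _, grad = batch_loss_and_grad(beta, X[idx], time[idx], event[idx])
            beta -= lr * grad
    return beta


# Simulated example: exponential event times with hazard exp(x @ true_beta)
# and independent exponential censoring.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
true_beta = np.array([1.0, -0.5])
t_event = rng.exponential(1.0 / np.exp(X @ true_beta))
t_cens = rng.exponential(2.0, size=500)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(float)

beta_hat = fit_sgd(X, time, event)
print("estimated coefficients:", beta_hat)
```

Because each update needs only one batch in memory, the same loop could in principle stream batches from disk or replace the linear predictor `X @ beta` with a neural network's output. Whether this matches the authors' construction in detail should be checked against the full paper.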

Journal: Biostatistics
Publication Date: Oct 26, 2022
Citations: 3
