A Model Selection Approach for Variable Selection with Censored Data

María Eugenia Castellanos, Stefano Cabras, Gonzalo García-Donato

doi:10.1214/20-ba1207

Abstract

We consider the variable selection problem when the response is subject to censoring. A main particularity of this context is that the information content of sampled units varies depending on the censoring times. Our approach is based on model selection, where all 2^k possible models are entertained, and we adopt an objective Bayesian perspective in which the choice of prior distributions is a delicate issue given the well-known sensitivity of Bayes factors to these prior inputs. We show that borrowing priors from the ‘uncensored’ literature may lead to unsatisfactory results, as this default procedure implicitly assumes a uniform contribution of all units independently of their censoring times. In this paper, we develop specific methodology based on a generalization of the g-priors that explicitly addresses the particularities of survival problems. We argue that it behaves comparatively better than standard approaches on the basis of arguments specific to variable selection problems (e.g. predictive matching) in the particular case of the accelerated failure time model with lognormal errors. We apply the methodology to a recent large epidemiological study about breast cancer survival rates in Castellón, a province of Spain.

Highlights

In variable selection we have k possible explanatory variables, but it is uncertain which of these are relevant to explain the response.
Our research is rooted in the Bayesian paradigm and, more specifically, in methods based on the posterior distribution that assigns to each candidate model its probability conditional on the observed data.
An illustration of the potential misbehavior of such default procedures is presented in Section 3, where we show how a group of experimental units with very small censoring times may severely modify the result of the variable selection exercise.

Summary

Introduction and motivation

In variable selection we have k possible explanatory variables, but it is uncertain which of these are relevant to explain the response. The developed ideas are potentially useful for other types of parametric or semiparametric models usually employed in survival analysis. This family of priors, which has been studied in depth in Berger and Pericchi (2001) and Bayarri and García-Donato (2007), has received much attention in the literature and has been extended to problems beyond the original Gaussian model to include various types of error distributions. These methods are strongly based on the concept of minimal training sample (see Berger and Pericchi, 2004, for a review of the topic), whose definition is intriguing in problems with observations with different information content (as here). Strategies to circumvent these difficulties have been developed in the series of papers Perra et al. (2013), Cabras et al. (2014) and Cabras et al. (2015), but these approaches are computationally intensive, since an integral must be evaluated for every training sample and many integrals are needed for one model comparison.
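
As a rough illustration of the model selection view described above, the sketch below enumerates all 2^k candidate models and turns Bayes factors against the null model into posterior model probabilities and posterior inclusion probabilities. It is not the authors' implementation: the function names, the uniform prior over models, and the Bayes factor values in the usage example are all assumptions made for illustration, and computing the Bayes factors themselves (the hard part under censoring) is left out.

```python
from itertools import product
import math


def posterior_model_probabilities(k, log_bf, log_prior=None):
    """Posterior probability of each of the 2^k models.

    log_bf maps an inclusion tuple (one 0/1 flag per variable) to the log
    Bayes factor of that model against the null model (hypothetical input).
    """
    models = list(product((0, 1), repeat=k))  # all 2^k inclusion patterns
    if log_prior is None:
        # Assumption: uniform prior over models.
        log_prior = {m: 0.0 for m in models}
    log_w = [log_bf[m] + log_prior[m] for m in models]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return {m: wi / total for m, wi in zip(models, w)}


def inclusion_probabilities(post):
    """Posterior inclusion probability of each variable."""
    k = len(next(iter(post)))
    return [sum(p for m, p in post.items() if m[j] == 1) for j in range(k)]


# Usage with made-up Bayes factors for k = 2 variables.
example_log_bf = {
    (0, 0): 0.0,            # null model against itself
    (0, 1): 0.0,
    (1, 0): math.log(3.0),
    (1, 1): math.log(5.0),
}
example_post = posterior_model_probabilities(2, example_log_bf)
example_incl = inclusion_probabilities(example_post)
```

With these illustrative inputs the model containing both variables receives half of the posterior mass, and the first variable has the higher inclusion probability.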

The statistical model considered
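
The abstract names the accelerated failure time model with lognormal errors. A minimal sketch of the standard right-censored log-likelihood for such a model follows, assuming log T = x'beta + sigma * eps with standard normal eps. The function name, argument layout, and example values are hypothetical and are not taken from the paper.

```python
import math


def lognormal_aft_loglik(times, events, X, beta, sigma):
    """Log-likelihood of a lognormal AFT model with right censoring.

    times  : observed times (event time or censoring time), all > 0
    events : 1 if the event was observed, 0 if the unit was censored
    X      : list of covariate rows, each the same length as beta
    """
    total = 0.0
    for t, d, x in zip(times, events, X):
        mu = sum(b * xi for b, xi in zip(beta, x))
        z = (math.log(t) - mu) / sigma
        if d:
            # Observed unit: log of the lognormal density at t.
            total += -0.5 * z * z - math.log(sigma * t) - 0.5 * math.log(2 * math.pi)
        else:
            # Censored unit: log survival function, 1 - Phi(z).
            total += math.log(0.5 * math.erfc(z / math.sqrt(2)))
    return total


# Illustrative values: intercept-only model, beta = [0], sigma = 1.
ll_observed = lognormal_aft_loglik([1.0], [1], [[1.0]], [0.0], 1.0)
ll_censored = lognormal_aft_loglik([1.0], [0], [[1.0]], [0.0], 1.0)
ll_early_censored = lognormal_aft_loglik([1e-6], [0], [[1.0]], [0.0], 1.0)
```

In this sketch a unit censored at a very small time contributes a log-likelihood term close to zero, which is one way to see that the information content of a unit depends on its censoring time.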

Motivating example

General considerations

The prior covariance matrix

Properties of ΣM and the proposed prior

Computing Bayes factors

The approximated Bayes factor is

Predictive matching

Variable selection

A simulation study over heart transplant data

Findings

Further remarks


Journal: Bayesian Analysis
Publication Date: Apr 29, 2020
Citations: 3
License type: cc-by
