Secondary Analysis under Cohort Sampling Designs Using Conditional Likelihood

Olli Saarela, Juha Karvanen, Sangita Kulathinal

doi:10.1155/2012/931416

Abstract

Under cohort sampling designs, additional covariate data are collected on cases of a specific type and a randomly selected subset of noncases, primarily for the purpose of studying associations with a time-to-event response of interest. With such data available, interest may arise in reusing them to study associations between the additional covariate data and a secondary non-time-to-event response variable, usually collected for the whole study cohort at the outset of the study. Following earlier literature, we refer to such a situation as secondary analysis. We outline a general conditional likelihood approach for secondary analysis under cohort sampling designs and discuss the specific situations of case-cohort and nested case-control designs. We also review alternative methods based on full likelihood and inverse probability weighting. We compare the alternative methods for secondary analysis in two simulated settings and apply them in a real-data example.

Highlights

Cohort sampling designs are two-phase epidemiological study designs where information on time-to-event outcomes of interest over a followup period and some basic covariate data are collected on the whole first-phase study group, referred to as a cohort, and in the second phase, more expensive or difficult-to-obtain additional covariate data are collected only on a subset of the study cohort.
Examples are the case-cohort [1–3] and nested case-control [4, 5] designs. Such designs are applied for the purpose of studying associations between the time-to-event outcomes and the covariates collected in the second phase.
Conditional likelihood inference under cohort sampling designs has been studied previously for the analysis of the primary time-to-event outcome by Langholz and Goldstein and by Saarela and Kulathinal; here, we extend these methods to the secondary analysis setting.

Summary

Introduction

Cohort sampling designs are two-phase epidemiological study designs where information on time-to-event outcomes of interest over a followup period and some basic covariate data are collected on the whole first-phase study group, referred to as a cohort, and in the second phase, more expensive or difficult-to-obtain additional covariate data are collected only on a subset of the study cohort. Conditional likelihood inference under cohort sampling designs has been studied previously for the analysis of the primary time-to-event outcome by Langholz and Goldstein and by Saarela and Kulathinal; here, we extend these methods to the secondary analysis setting. Additional covariate data (here, the lactase persistence genotype Z_i) are collected only on the second-phase study group O ≡ {i : R_i = 1} ⊆ C, specified by the inclusion indicators R_i ∈ {0, 1}, analogously to the survey response/nonresponse setting of Rubin [21]. Observed data likelihoods may become sensitive to misspecification of the model for the response variable; the missing data can act as extra parameters, and the actual model parameters may lose their intended interpretation. This is a real problem especially in cohort sampling designs with a rare event of interest, since the proportion of uncollected covariate data in the study cohort may be very high.
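To make the notation concrete, the following is a minimal simulation sketch of a second-phase sample and of why naive reuse of the selected set O can mislead. It is not the conditional likelihood method of the paper; it only illustrates the inverse probability weighting idea that the abstract lists as an alternative. All names (`case_cohort_sample`, `ipw_mean`), the Bernoulli sampling of noncases, the binary covariate, and every numeric value are assumptions chosen for illustration.

```python
import random


def case_cohort_sample(is_case, noncase_fraction, rng):
    """Draw inclusion indicators R_i and inclusion probabilities pi_i.

    Assumption for this sketch: every case is selected with probability 1,
    and each noncase is selected independently with `noncase_fraction`.
    """
    inclusion, prob = [], []
    for case in is_case:
        p = 1.0 if case else noncase_fraction
        prob.append(p)
        inclusion.append(1 if rng.random() < p else 0)
    return inclusion, prob


def ipw_mean(values, inclusion, prob):
    """Weighted mean over O = {i : R_i = 1}, with weights 1 / pi_i."""
    num = sum(v / p for v, r, p in zip(values, inclusion, prob) if r == 1)
    den = sum(1.0 / p for r, p in zip(inclusion, prob) if r == 1)
    return num / den


rng = random.Random(2012)
N = 20000  # hypothetical cohort size

# Hypothetical binary covariate Z_i. In a real study Z_i is unobserved when
# R_i = 0; it is simulated for everyone here only to allow a comparison.
Z = [1 if rng.random() < 0.3 else 0 for _ in range(N)]

# Case status depends on Z_i, so cases are enriched for Z_i = 1.
is_case = [rng.random() < (0.08 if z else 0.02) for z in Z]

R, pi = case_cohort_sample(is_case, 0.1, rng)

cohort_mean = sum(Z) / N
naive_mean = sum(z for z, r in zip(Z, R) if r == 1) / sum(R)
weighted_mean = ipw_mean(Z, R, pi)

print(f"cohort mean of Z:   {cohort_mean:.3f}")
print(f"unweighted mean, O: {naive_mean:.3f}")
print(f"weighted mean, O:   {weighted_mean:.3f}")
```

Because cases are oversampled and case status depends on Z_i in this simulation, the unweighted mean over O overstates the cohort frequency of Z_i = 1, while the weighted mean should land close to it. The same selection mechanism is what a secondary analysis has to account for, whichever of the reviewed approaches is used.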

Methods

Definition

Conditional Likelihood Expression

Special Cases

Risk Set Sampling

Missing Second-Phase Covariate Data

Full Likelihood

Inverse Probability Weighting

Conditional Likelihood

Incident Outcomes and Left Truncation

Simulation Study

Multimodality under Full Likelihood When the Sampling Fraction Is Small

An Example with Real Data

Discussion

On Inverse-Probability-Weighted Pseudolikelihood Estimators

Relationship to Retrospective Likelihood

Mean and Variance of the Conditional Likelihood Score Function

Journal: Journal of Probability and Statistics	Publication Date: Jan 1, 2012
Citations: 11	License type: CC BY 3.0

