Flexible Models for Competing Risks and Weighted Analyses of Composite Endpoints

Đức Anh Nguyễn

doi:10.21954/ou.ro.0000ef99

Abstract

In many clinical studies, the occurrence of different types of disease events over time is of interest. For example, in cardiovascular studies, disease events such as death, stroke or myocardial infarction are of interest. As another example, in central nervous system infections such as cryptococcal meningitis, unfavourable events such as death or neurological events and favourable events such as coma clearance or fungal clearance are relevant. In statistical terminology, competing risks refer to data where the time and type of the first disease event are analysed. Such data arise naturally if a nonfatal disease event is of interest but is precluded by death in a substantial proportion of subjects. Competing risks are the topic of the first four chapters of this thesis. An alternative approach used in many randomized controlled clinical trials is to combine different harmful events into a single composite endpoint. The analysis of trials with composite endpoints is the topic of the fifth chapter. This thesis is organised as follows: Chapters 1 and 2 are introductory chapters and provide an overview of statistical approaches to competing risks and semi-nonparametric (SNP) density estimation. Two concepts that form the basis for the work in Chapters 3 and 4 are introduced here: the cumulative incidence function (CIF) and SNP densities. For competing risks data, the CIF describes the absolute risk of different event types depending on time and is the most important quantity for data description, prognostic modelling, and medical decision making. SNP densities are densities that can be expressed as the product of a squared polynomial (of variable degree) and a base density, which is chosen as the standard normal or the exponential density in this work. Chapter 3 presents a novel approach to CIF estimation.
The underlying statistical model is specified via a mixture factorization of the joint distribution of the event type and time, and the time-to-event distributions conditional on the event type are modelled using SNP densities. One key strength of the approach is that it can handle arbitrary censoring and truncation. A stepwise forward algorithm for model estimation and adaptive selection of SNP polynomial degrees is presented, implemented in the statistical software R, evaluated in a sequence of simulation studies, and applied to data sets from clinical trials in central nervous system infections. The simulations demonstrate that the SNP approach frequently outperforms both parametric and nonparametric alternatives. They also support the use of “ad hoc” asymptotic inference to derive confidence intervals despite the lack of a formal mathematical verification of the relevant asymptotic properties. Chapter 4 extends the work of Chapter 3 to regression modelling, i.e., the quantification of covariate effects on the CIF. A careful discussion of interpretational and identifiability issues that are intrinsic to models based on the mixture factorization is provided, and use of the model is recommended only in settings with sufficient follow-up relative to the timing of the events. A simulation study demonstrates that the proposed approach is competitive with common statistical models for competing risks in terms of accuracy of parameter estimates and predictions. However, it also shows that “ad hoc” asymptotic inference is only valid if the sample size is large. The chapter also provides a suggestion for model diagnostics of the proposed model, an area that has been somewhat neglected for competing risks data. Chapter 5 discusses the analysis of composite endpoints. A common critique of traditional analyses of composite endpoints is that all disease events are equally weighted whereas their clinical relevance may differ substantially.
This chapter addresses this critique by introducing a framework for the weighted analysis of composite endpoints that handles both binary and time-to-event data. To address the difficulty of selecting an exact set of weights, it proposes a method, based on the theory of χ̄² (chi-bar-squared) distributions, for constructing simultaneous confidence intervals and tests that protect the familywise type I error in the strong sense across families of weights which satisfy flexible inequality and order constraints. It is then demonstrated in several simulation scenarios as well as applications that the proposed method achieves the nominal simultaneous overall coverage rate with less efficiency loss than the standard Scheffé procedure. Final remarks are given in Chapter 6 together with an outlook for potential future research directions.
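
The two building blocks named in the abstract, SNP densities and the mixture factorization of the CIF, can be illustrated with a minimal Python sketch. It is not the thesis's R implementation. It assumes the exponential base density, fixed polynomial coefficients rather than estimated ones, a simple scale parameter for time, and no censoring, truncation or covariates. All function names, parameter names and numerical values are hypothetical and chosen for illustration only.

```python
import math


def exp_power_integral(k, x):
    """Integral of u**k * exp(-u) over [0, x] for a non-negative integer k."""
    partial = sum(x**m / math.factorial(m) for m in range(k + 1))
    return math.factorial(k) * (1.0 - math.exp(-x) * partial)


def snp_exp_cdf(t, coefs, scale=1.0):
    """CDF of an SNP density with exponential base (illustrative assumption).

    The density of x = t / scale is proportional to
    (sum_i coefs[i] * x**i)**2 * exp(-x) for x >= 0.
    """
    x = max(t, 0.0) / scale
    numerator = 0.0
    normaliser = 0.0
    # Expanding the squared polynomial gives terms a_i * a_j * x**(i + j).
    for i, a_i in enumerate(coefs):
        for j, a_j in enumerate(coefs):
            numerator += a_i * a_j * exp_power_integral(i + j, x)
            normaliser += a_i * a_j * math.factorial(i + j)
    return numerator / normaliser


def cif(t, k, probs, coefs_by_type, scales):
    """Mixture factorization: F_k(t) = P(type = k) * P(T <= t | type = k)."""
    return probs[k] * snp_exp_cdf(t, coefs_by_type[k], scales[k])


# Hypothetical two-event example: event-type probabilities sum to one.
probs = [0.6, 0.4]
coefs_by_type = [[1.0, 0.5], [1.0]]  # polynomial degrees 1 and 0
scales = [2.0, 5.0]

for t in (1.0, 5.0, 20.0):
    values = [cif(t, k, probs, coefs_by_type, scales) for k in range(2)]
    print(t, values)
```

With a polynomial of degree 0 the SNP density reduces to the base density itself, so higher degrees can be read as increasingly flexible departures from an exponential model. In this factorization each CIF rises to its event-type probability as time grows, which is one way to see why the abstract recommends the model only when follow-up is long relative to the timing of the events.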
