Abstract

We interpret methods development in causal inference as a process of constructing tools for answering internal questions (following Carnap1). Here, “internal” means that these questions can only be understood with respect to an ontology (otherwise their meaning is unclear). Specifically, we argue that epidemiologists will be aided by an understanding of ontologies when faced with the difficult tasks of estimand selection and statistical model specification. To illustrate our points, we refer to the article by Davis-Plourde et al.2 concerning causal inference methods in dementia research.

WHAT IS ONTOLOGY?

The term ontology derives from philosophy and concerns the study of the existence of things.3 More recently, the term has been used in computer science, linguistics, and artificial intelligence to describe a formal set of (representational) terms that defines a “universe of discourse.”4 Specifying such sets of terms and their relations facilitates computational reasoning and the translation of information within and across systems. Epidemiology, since its origins, has been a discipline that “counts”5 and computes, and that critically engages in the translation of scientific evidence between an array of social and scientific disciplines. However, the term ontology is rarely used in epidemiology (for some exceptions, see Haendel et al.,6 Muntaner and Dunn,7 and others8,9). Here we use the term extensively to denote the formal causal and statistical frameworks, composed of definitions and axioms, that are routinely exploited (but often only implicitly referenced) in epidemiologic data analysis. While more familiar terms, like “framework” or “model,” might have served a similar purpose, we choose the term ontology, along the lines of Robins and Richardson10 and Lauritzen and Richardson,11 precisely because of its philosophical connotations. Choices concerning “frameworks” and “models” (henceforth, ontologies) are indeed philosophical in nature and may defy empirical interrogation. Nevertheless, such choices are crucial for unambiguously linking data, numbers, and statistical procedures to the plain language formulations of epidemiologic theories and research questions.12,13 We argue that increased attention to causal and statistical ontologies, and their relations with epidemiologic theory, will aid in the tasks of estimand selection, model specification, and identification, which are important steps in any epidemiologic inquiry.

ONTOLOGIES IN CAUSAL INFERENCE

One example of a specific causal ontology is the Finest Fully Randomized Causally Interpretable Structured Tree Graph (FFRCISTG) articulated by Robins.12 The FFRCISTG defines counterfactuals for individuals as the fixed and deterministic outcomes that would occur in a hypothetical controlled study.12 However, the FFRCISTG is just one of many different causal ontologies, each with different implications for the existence, meaning, and interpretation of classical objects for causal inference.
For example, “agnostic” causal ontologies make no reference to counterfactuals whatsoever.12,14–16 Other ontologies posit the existence of counterfactuals that, rather than being fixed, are inherently stochastic.17 Commitment to one causal ontology or another sometimes has minimal practical implications for discourse, but other times the impact is profound.10 For example, consider Investigators A, B, and C, who commit to the following causal ontologies, respectively: an FFRCISTG, an FFRCISTG with stochastic counterfactuals, and an agnostic causal ontology. Further, consider two parameters that make reference to a two-arm trial comparing treatment versus placebo: (1) the difference in expected outcomes between treatment and control [i.e., the average causal effect (ACE)] and (2) that same difference among patients who would have survived in both arms of the trial [i.e., the survivor average causal effect (SACE)]. All three investigators would have no issues discussing the ACE; although they might use different symbols, the distinguishing ontological elements (i.e., the existence of counterfactuals and their stochastic nature) are extraneous to the parameter in question. However, discourse around the SACE would fail. For Investigator C, the conditioning event, defined in terms of counterfactuals, is not well-defined14 (see the accompanying commentary by Robins and Greenland, 2000). For Investigator B, the counterfactuals that define the SACE exist, but not the SACE parameter itself; when counterfactuals are stochastic, the conditioning event in the SACE parameter does not denote a stable subpopulation. For Investigator A, the SACE parameter is well-defined.

For epidemiologic inquiries, causal ontologies will often be supplemented with statistical ontologies, which provide meaning to common parameters (e.g., expectations) and standard statistical objects (e.g., confidence intervals). In some statistical ontologies, causal parameters (such as expectations of counterfactual outcomes) are (arithmetic) means that occur in near-infinitely large superpopulations, while in other ontologies, these are means in the study sample itself. In superpopulation frameworks with deterministic counterfactuals, statistical uncertainty arises from sampling variability, and statistical inference is based on hypothetical resampling from the superpopulation. In design-based frameworks, parameters are defined in terms of the sample: uncertainty arises primarily from missing data (untreated patients’ counterfactual outcomes under treatment are unobserved and vice versa), and statistical inference is based on hypothetical re-randomizations of treatment to the study sample.18 Choices concerning statistical ontology imply not only different computational procedures but also different interpretations of common causal parameters that may be relevant for public health decision-making. For example, a hypothesis test about an “average causal effect” may have more or less relevance to a policymaker concerned with the health of a future population, depending on whether a superpopulation or design-based statistical ontology was deployed, especially when such tests would reject under one and not under the other. The interaction between causal and statistical ontologies is also important. For example, the properties of design-based inferential tools will depend on whether counterfactuals are posited to be deterministic or stochastic.
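A minimal sketch may make the design-based notion concrete. The following code, under an entirely hypothetical trial and outcome model, tests the sharp null hypothesis by re-randomizing treatment labels; under the sharp null, each patient’s outcome is fixed regardless of assignment, so the permuted statistics form the exact null distribution, with no superpopulation in sight.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-arm trial: n patients, half assigned to treatment.
n = 200
a = rng.permutation(np.repeat([0, 1], n // 2))   # randomized assignment
y = rng.binomial(1, 0.3 + 0.2 * a)               # illustrative outcome model

def mean_difference(assignment, outcomes):
    """Difference in mean outcomes, treated minus control."""
    return outcomes[assignment == 1].mean() - outcomes[assignment == 0].mean()

observed = mean_difference(a, y)

# Design-based (re-randomization) test of the sharp null Y^{a=1} = Y^{a=0}:
# re-randomize the treatment labels and recompute the statistic.
null_stats = np.array(
    [mean_difference(rng.permutation(a), y) for _ in range(10_000)]
)
p_value = np.mean(np.abs(null_stats) >= np.abs(observed))
print(f"observed difference: {observed:.3f}, p-value: {p_value:.3f}")
```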
See also Robins19 and Abadie et al.20 for extended discussions of ontologic differences between superpopulation and design-based statistical frameworks and their implications for interpretation and statistical inference.

CHALLENGES FOR ESTIMAND SELECTION

An epidemiologic data analysis should be linked to a well-defined epidemiologic task formulated in plain language, such as a policy decision or interrogation of epidemiologic theory.5 In contrast, the aim of a data analysis is to learn about a statistical object formulated in technical symbols. Statistical and causal ontologies define a set of statistical objects and provide a bridge to epidemiologic tasks by mapping statistical objects to plain language definitions. Equipped with an ontology and faced with an epidemiologic task, an investigator can then select the most appropriate estimand from the set of well-defined statistical objects.

However, even after commitment to an ontology, selecting the most appropriate estimand can be challenging. For example, the stated task of Davis-Plourde et al.2 is to identify “determinants” $A$ of outcome $Y$. This aim will often motivate a sharp null hypothesis, which can be formulated under a deterministic counterfactual causal ontology as $Y^{a=1} = Y^{a=0}$ for all members of the population. However, the outcome of Davis-Plourde et al. was cognitive decline, $Y$, which is undefined for a patient who is not alive, $D = 1$. Parameters concerning the effects of treatment on such an outcome will only have meaning conditional on survival, $D = 0$. Alternatively, cognitive function might be defined to take a value of 0 among those who are dead. This indicates an ontology that treats death as a competing event (as opposed to a truncating event),21 but we believe that most subject matter experts would consider such an ontology to be inadequate in this setting.

When a treatment affects intermediate survival $D$, naive conditioning on survival will generally lead to invalid inference. This phenomenon is often referred to as selection bias in the epidemiologic literature22,23 but may also be understood as an example of a type III error24: an error arising from selecting the wrong estimand. For example, suppose that an investigator selects the causal estimand $E[Y^{a=1} \mid D^{a=1}=0] - E[Y^{a=0} \mid D^{a=0}=0]$, which is readily identified in a perfectly executed trial without additional assumptions. However, this well-defined parameter cannot in general be used as a valid test of a sharp null hypothesis; the conditioning events of the two terms in the difference ($D^{a=1}=0$ and $D^{a=0}=0$, respectively) possibly describe different patients, and so the estimand is not an average causal effect. Similar issues plague the interpretation of hazard ratios12,25,26 for survival outcomes $D_t$,
$$\frac{\Pr[D_t^{a=1}=1 \mid D_{t-1}^{a=1}=0]}{\Pr[D_t^{a=0}=1 \mid D_{t-1}^{a=0}=0]},$$
where here and henceforth we assume a time grid as fine as the data, indexed by $t$ and with maximum time $K$.

To make progress, some authors select estimands that are guaranteed to be average causal effects in a subset of the population with well-defined outcomes. For example, the SACE, $E[Y^{a=1} - Y^{a=0} \mid D^{a=1} = D^{a=0} = 0]$, conditions on the “always survivors,” a subset of the population that survives under both treatment levels $a = 1$ and $a = 0$.12,27 Tests of such parameters can be used to test the sharp null hypothesis. For example, $E[Y^{a=1} - Y^{a=0} \mid V = v] \neq 0$ implies that $Y^{a=1} \neq Y^{a=0}$ for at least one person with arbitrary variable $V = v$. However, such tests may not be consistent because $E[Y^{a=1} - Y^{a=0} \mid V = v] = 0$ (the null hypothesis) does not imply the absence of any individual-level effect (the sharp null hypothesis). Under the (nonsharp) null hypothesis, patients harmed by an exposure may be perfectly balanced by patients who benefit, or individuals affected by treatment may fall out of the conditioning set.
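The invalidity of naive survivor conditioning is easy to exhibit numerically. The following sketch, under an entirely hypothetical data-generating mechanism, simulates a trial in which the sharp null holds exactly, yet the survivor-conditioned contrast is far from 0.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)
n = 1_000_000  # large n so that sampling noise is negligible

# The sharp null holds by construction: a never enters the model for y.
a = rng.binomial(1, 0.5, n)                # randomized exposure
u = rng.normal(size=n)                     # unmeasured frailty
d = rng.binomial(1, sigmoid(-1 + a + u))   # death before follow-up (D = 1)
y = u + rng.normal(size=n)                 # cognition, affected by u only

naive = y[(d == 0) & (a == 1)].mean() - y[(d == 0) & (a == 0)].mean()
print(f"survivor-conditioned contrast under the sharp null: {naive:.3f}")
# The contrast is clearly negative: exposed patients needed lower
# frailty u to survive, so conditioning on D = 0 links a to y.
```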
A Peculiarity in the Estimand of Davis-Plourde et al.

Davis-Plourde et al.2 do not explicitly name their causal ontology. However, we infer from their estimand and notation that they commit to a causal ontology with deterministic counterfactuals (along the lines of the FFRCISTG of Robins12), and so we consider their estimand from this perspective. Davis-Plourde et al. identify their estimand as the “subject-specific partly conditional expectation,” which they define as the parameter
$$E[\Delta Y_{i,t}^{a=1} \mid T_i^{a=1} = T_i^{a=0} > t,\ L_i = l,\ b_i^{a=1}] - E[\Delta Y_{i,t}^{a=0} \mid T_i^{a=1} = T_i^{a=0} > t,\ L_i = l,\ b_i^{a=0}]$$
for some time $t$. Davis-Plourde et al. later assume the “subject-specific random effects” $b_i$ are unaffected by treatment, so that the equivalence $b_i^{a=0} = b_i^{a=1} = b_i$ always holds for all $i$. Then we reexpress their estimand as
$$E[\Delta Y_{i,t}^{a=1} - \Delta Y_{i,t}^{a=0} \mid T_i^{a=1} = T_i^{a=0} > t,\ L_i = l,\ b_i], \tag{1}$$
which we recognize as a particular conditional SACE. It conditions not only on measured covariates $L_i = l$ and the unmeasured latent “random effect” $b_i$, but also on the patients whose continuous survival times would be identical under both exposure levels, reflecting a conditioning set for whom exposure has no effect on death at any time point.

The estimand of Davis-Plourde et al. appears similar to a conventional SACE: it ensures that the variable $\Delta Y_{i,t}^{a=1} - \Delta Y_{i,t}^{a=0}$ is well-defined for all individuals in the conditioning set. However, it also exhibits some additional peculiarities, which we now consider. Under a generalized consistency assumption, we can reexpress the counterfactuals defining the principal stratum as $T_i^a = T_i^{a, \Delta Y_t^a}$. That is, the patient’s counterfactual survival time upon setting exposure to $a$ is equal to that survival time upon setting exposure to $a$ and setting their cognitive change at $t$ to the level it would take upon setting exposure to $a$, $\Delta Y_t^a$. Thus, a principal stratum defined by $T_i^{a=1} = T_i^{a=0} > t$ could be equivalently written as $T_i^{a=1, \Delta Y_t^{a=1}} = T_i^{a=0, \Delta Y_t^{a=0}} > t$. Heterogeneity in the effect of exposure on cognitive decline means that for some individuals, $\Delta Y_t^{a=1} - \Delta Y_t^{a=0}$ will be large and positive, and for others, $\Delta Y_t^{a=1} - \Delta Y_t^{a=0}$ will be small and close to 0. But we have seen that the estimand of Davis-Plourde et al. is defined by individuals whose counterfactual survival time under $a = 0$ and $\Delta Y_t^{a=0}$ is exactly the same as their counterfactual survival time under $a = 1$ and $\Delta Y_t^{a=1}$. In expressing counterfactuals partly defined by interventions on cognitive decline, it is more easily seen that this estimand studies a subgroup composed of individuals with one of the two following characteristics: (1) their exposure causes changes in cognitive decline $\Delta Y_t^{a=1} - \Delta Y_t^{a=0}$ that are so small as to be irrelevant for their survival, or (2) their survival is robust to moderate or large changes in cognition $\Delta Y_t^{a=1} - \Delta Y_t^{a=0}$. But then the parameter of Davis-Plourde et al. is precisely the expectation of these differences in cognitive decline, $\Delta Y_t^{a=1} - \Delta Y_t^{a=0}$. If the estimand of Davis-Plourde et al. is large, it must be because of group 2, for whom large changes in cognition do not affect survival at any time point, which arguably is an eccentric subset of the population.
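To see how restrictive this conditioning set is, consider a toy simulation with continuous survival times. The mechanism and all parameter values are hypothetical; the point is only that when exposure shifts survival time (here through its effect on cognition, with a continuously distributed effect), essentially no one satisfies $T^{a=1} = T^{a=0}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Hypothetical counterfactual quantities for each individual:
delta = rng.normal(0.5, 1.0, n)   # exposure effect on cognitive change
t0 = rng.exponential(10.0, n)     # survival time T^{a=0}, in years
t1 = t0 + 0.1 * delta             # T^{a=1}: cognition shifts survival by
                                  # an assumed (small) factor of 0.1

in_stratum = t1 == t0             # principal stratum T^{a=1} = T^{a=0}
print(f"share with identical survival times: {in_stratum.mean():.6f}")
# With continuous survival times, equality requires delta to be exactly
# 0, an event of probability 0: the stratum is (almost surely) empty.
```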
If we make the simplifying (albeit not unreasonable) assumption that a change in cognition has some (even infinitesimal) effect on time to death for all patients, regardless of their treatment status, i.e., $P(T^{a, \Delta y_t} \neq T^{a, \Delta y'_t}) = 1$ for all $a$, $t$, and $y_t \neq y'_t$, then group 2 will not exist and the peculiar SACE [Expression (1)] is guaranteed to be equal to 0. It is possible that Davis-Plourde et al. meant to target the estimand
$$E[\Delta Y_{i,t}^{a=1} - \Delta Y_{i,t}^{a=0} \mid T_i^{a=1} > t,\ T_i^{a=0} > t,\ L_i = l,\ b_i], \tag{2}$$
which conditions on patients who would survive up to time $t$ under both treatment levels. This parameter avoids the peculiarities we highlight, while still ensuring that a treatment effect on cognition is well-defined for all patients in the conditioning set. However, while a test of any SACE might sometimes be used as a valid test of the sharp null hypothesis, its utility in this context is limited. Regardless of whether the estimand of Expression (1) or (2) is selected, the “always survivors” may represent an eccentric or very small (possibly nonexistent) subpopulation whose members are not generally identifiable from data.28–30 Careful attention to the causal ontologies that define a SACE helps illuminate these features. When investigators are concerned not merely with causal identification, but with policy decisions for individual patients or observable groups of patients, they may consider other estimands.

Elaborated Ontologies and Alternative Estimands for Truncated Outcomes

Rather than reflecting the limitations of existing estimands, the challenges of ill-defined outcomes under truncating events may reflect conceptual limitations of popular causal ontologies. With this in mind, Robins and Richardson10 elaborate an ontology31–34 in which a treatment $A$ can be decomposed into an element $A_Y$ that affects the outcome completely unmediated by survival and an element $A_D$ that affects the outcome solely via its effects on survival. Under this elaborated ontology, an investigator can test whether the $A_Y$ component of treatment had some effect on $Y$ among survivors. Further, if $A_D$ and $A_Y$ might (possibly in the future) be separably manipulable, an investigator might be interested in the expected potential outcome under an intervention that solely assigned the $A_Y$ component and omitted the $A_D$ component, $E[Y^{a_Y=1} - Y^{a_Y=0} \mid D^{a_D=0} = 0]$. This parameter is the so-called conditional separable effect,35 which can be empirically validated in a future (four-arm) trial where $A_Y$ and $A_D$ are separately assigned, as sketched below. We emphasize that the aim of this article is not to advocate separable effects over principal stratum effects in all possible settings; our message is that (elaborated) causal ontologies can allow us to consider and justify new estimands, which can be useful in certain settings.
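The following is a minimal sketch of how such a four-arm trial could speak to this parameter. The mechanism, effect sizes, and the assumption that each component acts only through its posited pathway are hypothetical and purely illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)
n = 500_000

# Hypothetical four-arm trial: the components A_Y and A_D are
# independently randomized and (by assumption) act as the elaborated
# ontology posits: A_D on survival only, A_Y on cognition only.
a_y = rng.binomial(1, 0.5, n)
a_d = rng.binomial(1, 0.5, n)
u = rng.normal(size=n)                       # shared frailty

d = rng.binomial(1, sigmoid(-1 + a_d + u))   # death depends on A_D, not A_Y
y = 0.5 * a_y + u + rng.normal(size=n)       # cognition depends on A_Y only

# Conditional separable effect E[Y^{a_Y=1} - Y^{a_Y=0} | D^{a_D=0} = 0]:
# within the a_d = 0 arms the survivors form one well-defined
# subpopulation, so a simple contrast in a_y recovers the effect.
keep = (a_d == 0) & (d == 0)
cse = y[keep & (a_y == 1)].mean() - y[keep & (a_y == 0)].mean()
print(f"estimated conditional separable effect: {cse:.3f} (truth: 0.5)")
```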
MODEL SPECIFICATION AND IDENTIFICATION

In the preceding section, we demonstrated how ontological commitment (whether implicit or explicit) is necessary for selection of an estimand. When we carefully attend to causal ontologies, we are better equipped to analyze the relevance of different estimands for different public health tasks and to elaborate existing ontologies to consider novel targets of inference. In this section, we discuss how attention to ontology also facilitates a task distinct from estimand selection: model specification and identification. We argue that we gain additional insight into the importance of model specification and identification when we are explicit about ontologies.

Models Link Interpretation of Different Parameters

Once we have committed to an ontology and selected an estimand, we will most likely be confronted with the “fundamental problem of causal inference”36: causal parameters are not directly observed. When the definition of a causal parameter makes no reference to any factual parameters, we can know nothing about it without something more than our ontology. That “something more” is a model.

What, then, is a model, and how is it distinct from an ontology? A model can be understood as a set of restrictions on the space of values that ontologically defined parameters can possibly take. Restrictions are achieved via assumptions. Different investigators may share the same ontological commitments about the existence of various objects (e.g., counterfactual random variables and the definitions of probability distributions) but may make different assumptions about their possible values and interrelations. Assumptions purportedly encode substantive knowledge or belief. For example, we often justify “no unmeasured confounding” assumptions by substantive arguments. We recognize that the ontologic status of some familiar concepts in causal inference is sometimes grounds for dispute (e.g., the causal consistency condition must be assumed in some causal ontologies), but for others, it will be a property that follows by definition.37–39 However, these debates are easily resolved when investigators understand the ontologies they work within, and then the features that constitute a model become clear.

A model (understood as a set of restrictions) provides additional architecture to a parameter space under which two parameters may become linked. For example, with binary treatment and outcome, a model solely composed of the assumption that $Y = Y^A$ (otherwise known as consistency for outcome $Y$ with respect to exposure $A$) implies bounds on the ACE, $E[Y^{a=1} - Y^{a=0}]$, upon knowledge of the joint distribution of $A$ and $Y$ (see the sketch at the end of this section). When two parameters always coincide under a model, we will often “interpret” the observed parameter using the definition of the counterfactual parameter. Identification is the process of deriving these coincidences under a model.40

To illustrate the importance of identification in epidemiology, consider the following observed data parameter:
$$\frac{\sum_{\bar{l}_K} E\left(Y \mid D_{K+1}=0,\ \bar{L}_K=\bar{l}_K,\ A=1\right) f_{\bar{L}_1, D_{K+1} \mid L_0, A}\left(\bar{l}_1, 0 \mid l_0, 0\right) f_{L_0}(l_0)}{P\left(D_{K+1}=0 \mid A=0\right)} - E\left(Y \mid D_{K+1}=0,\ A=0\right).$$
It is challenging to assert any relevance of this parameter to a nonstatistical collaborator, if described in literal terms; it is just a functional of conditional expectations of observed variables. However, under a specified model that minimally includes a set of causal assumptions presented in Stensrud et al.,35 and an ontology that defines the existence of component variables $A_Y$ and $A_D$, this observed data parameter always coincides with the causal parameter termed the conditional separable effect, $E[Y^{a_Y=1, a_D=0} - Y^{a_Y=0, a_D=0} \mid D_{K+1}^{a_D=0} = 0]$. Under still further structural conditions, it also always coincides with a SACE, $E[Y^{a=1} - Y^{a=0} \mid D_{K+1}^{a=1} = D_{K+1}^{a=0} = 0]$.35 The coincidence between the SACE and the separable effect illustrates that interpretations of estimable observed data parameters are model dependent. Identification is thus not only a mathematical exercise but a semantic one. If we view science and decision-making as an inherently social phenomenon where meaning is central, then the careful analysis involved in identification is critically important; this analysis clarifies the meaning of the numbers we compute.
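Here is the promised sketch of the consistency-only bounds. The joint distribution of $A$ and $Y$ is hypothetical; the bounds follow because each unobserved counterfactual may take any value in $[0, 1]$.

```python
import numpy as np

# Hypothetical joint distribution of binary A and Y, as p[a, y] = P(A=a, Y=y).
p = np.array([[0.30, 0.20],   # A = 0: P(Y=0) = 0.30, P(Y=1) = 0.20
              [0.15, 0.35]])  # A = 1: P(Y=0) = 0.15, P(Y=1) = 0.35

# Consistency (Y = Y^A) pins down E[Y^a] only among patients with A = a;
# each unobserved counterfactual may take any value in [0, 1].
low_1, high_1 = p[1, 1], p[1, 1] + p[0].sum()   # bounds on E[Y^{a=1}]
low_0, high_0 = p[0, 1], p[0, 1] + p[1].sum()   # bounds on E[Y^{a=0}]

print(f"ACE bounds: [{low_1 - high_0:.2f}, {high_1 - low_0:.2f}]")
# Under consistency alone the interval always has width 1 and always
# contains 0: informative, but never sign-identifying by itself.
```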
Issues With the Proposal of Davis-Plourde et al.

As discussed in the section Challenges for Estimand Selection, Davis-Plourde et al. focus on the subject-specific partly conditional expectation of Expression (1) [or possibly Expression (2)], and we suppose they operate in a causal ontology with deterministic counterfactuals. Their stated aim is to evaluate “Joint Modeling” as a “strategy to address survival bias.” In terms of the concepts discussed so far, we interpret this as an aim to evaluate whether the observed data parameter that is estimated by the joint modeling strategy identifies Expression (1) under a model that assumes the presence of an unobserved common cause of survival and the outcome of interest. For brevity, while Davis-Plourde et al. further restrict their model with (strong) parametric assumptions, we will refer to their model of interest as a “Collider-Bias Model.”

Davis-Plourde et al. generated 1000 factual data samples (each of n = 2000, including both observed and unobserved variables for each simulated individual) using Monte-Carlo simulation methods under several mechanisms included in the Collider-Bias Model. For each of these mechanisms, the structural functions for generating cognitive outcomes at each time $t$ excluded the exposure. We emphasize that the simulated data were factual; counterfactual data were not generated. Davis-Plourde et al. then implemented a joint modeling algorithm, computing a term $\hat{\beta}_{1,1}$ in each data set, and then reported empirical averages and SDs of $\hat{\beta}_{1,1}$ across all 1000 data sets. Davis-Plourde et al. found that these averages were approximately 0 for some data-generating mechanisms in the Collider-Bias Model, but for most of the data-generating mechanisms specified for the simulation, the averages departed from 0 by more than 1 SD. For all data-generating mechanisms, the averages for the joint modeling algorithm were closer to 0 than those produced by procedures Davis-Plourde et al. claim are more prevalently used in the literature under Collider-Bias Models. Davis-Plourde et al. conclude that joint modeling approaches “are not a panacea, but they appear to offer important advantages over other popular methods,” and they recommend use of joint modeling approaches under a Collider-Bias Model over existing alternatives.

We criticize this conclusion for three main reasons. First, we question the validity of their simulation methods for evaluating the unbiasedness of an estimator for a causal parameter because Davis-Plourde et al. do not actually generate any counterfactual data. If Davis-Plourde et al. had specified a data-generating method to simulate joint factual and counterfactual variables, they could have used Monte-Carlo simulations to approximate a near-infinitely large superpopulation and directly computed the true value of their estimand, the subject-specific partly conditional expectation of Expression (1) [or Expression (2)], by taking the empirical average of $\Delta Y_{i,t}^{a=1} - \Delta Y_{i,t}^{a=0}$ among simulated individuals with $T_i^{a=1} = T_i^{a=0} > t$, $L_i = l$, and a given $b_i$.
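To illustrate, here is a minimal sketch of such a joint factual-and-counterfactual simulation. Every mechanism and parameter value is hypothetical (and not those of Davis-Plourde et al.); we target the Expression (2) variant because, with continuous survival times and a continuously distributed effect, the stratum of Expression (1) is empty.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000   # approximate a near-infinitely large superpopulation

# Jointly generate counterfactual variables under assumed mechanisms:
b = rng.normal(size=n)                       # subject-specific random effect
l = rng.binomial(1, 0.5, n)                  # baseline covariate L
t0 = rng.exponential(np.exp(1.0 + 0.5 * b))  # survival time T^{a=0}
t1 = rng.exponential(np.exp(0.7 + 0.5 * b))  # T^{a=1}: exposure shortens life
dy0 = 0.2 * b + rng.normal(size=n)           # cognitive change under a = 0
dy1 = dy0 - 0.1 + 0.05 * b                   # heterogeneous exposure effect

# Factual data follow from consistency, given a randomized exposure:
a = rng.binomial(1, 0.5, n)
t_factual = np.where(a == 1, t1, t0)

t = 2.0
# True value of the Expression (2)-type estimand at time t, here shown
# marginally over b among patients with L = 1:
stratum = (t0 > t) & (t1 > t) & (l == 1)
truth = (dy1[stratum] - dy0[stratum]).mean()
print(f"true conditional SACE at t = {t}: {truth:.3f}")
```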
Such a simulation is important not only for clarity but because the structure of factual data-generating mechanisms under some causal ontologies (e.g., the FFRCISTG) need not imply any restrictions on the causal structure between counterfactual variables under a counterfactual regime (see Robins et al.31 and Sarvet et al.41 for realistic examples). We critique the approach of Davis-Plourde et al. under an FFRCISTG ontology to highlight that such implications may be ontologically dependent. In this case, generating both factual and counterfactual data might have illuminated that additional assumptions (or a more restrictive causal ontology) are necessary for identification of the causal estimand.

Second, we suggest that formal arguments about identification of estimands under a model (usually via a theorem and a proof) are important when proposing a new method. The method of Davis-Plourde et al. involved applying an existing procedure under a new (Collider-Bias) model. There is active research on identification of counterfactual parameters in the presence of unmeasured confounders, or other biasing variables, using the first principles of a causal ontology,42–46 and some results are subject to considerable controversy.47 However, identification results were omitted by Davis-Plourde et al., and the coincidence between the observed data parameter consistently estimated by the joint modeling approach and the counterfactual estimand of Davis-Plourde et al. under a Collider-Bias Model has, to our knowledge, not been examined in the existing literature; previous works have considered other causal parameters under strong parametric assumptions,48,49 and other authors have discussed causal interpretations informally without a clear reference to a causal ontology.50,51 It is, therefore, unclear exactly under what additional assumptions such identification can be made given a Collider-Bias Model.

Because the simulations of Davis-Plourde et al. suggest bias for their estimand under some data-generating mechanisms, their results illustrate that these data-generating mechanisms cannot possibly be members of the models under which their estimand is identified by the joint modeling parameter. However, even if their simulation results did suggest that their estimand was consistently estimated for all data-generating mechanisms included in the simulations, we would have limited confidence in any conclusions about identification. Collider-Bias Models (as well as most reasonable statistical models) contain an infinite number of data-generating mechanisms, and the demonstration of unbiasedness under a finite subset of these mechanisms, which is all a simulation study can provide, should not be extrapolated to all data-generating mechanisms in the model. On the other hand, simulations can be used to falsify a claim about a model. For example, the simulations of Davis-Plourde et al. actually demonstrate that the joint modeling estimator does not identify their causal parameter for all data-generating mechanisms under the Collider-Bias Model. Simulations are, therefore, limited in their capacity to contribute to the important task of identification and might only support, but not replace, mathematical derivations.
Third, even if Davis-Plourde et al. had successfully proven that the joint modeling method consistently estimated their causal estimand under a general and realistic Collider-Bias Model, we question the claim that any principal stratum effect is directly useful for etiologic research or policy-making, for reasons mentioned in the section Challenges for Estimand Selection and elaborated elsewhere.28–30,35

CONCLUSION

“Without commitment to a paradigm there could be no normal science.” So said Kuhn52 in his influential essay on “The Structure of Scientific Revolutions” in the natural sciences. Epidemiology, however, is not a typical natural science. It is a uniquely interdisciplinary enterprise and surely benefits from this quality. We do not argue that we as epidemiologists should align with a single causal or statistical ontology. However, the very fact that causal and statistical ontologies will vary between individuals, over time, and across different subject matter issues highlights a prevailing need for increased attention to the ontologies we adopt. Studying the intersection of, and coherence between, different ontologies originally designed for different applied problems (like linear mixed-effects frameworks and the FFRCISTG causal model) should be recognized as a well-defined and important area of epidemiologic research. When we have a clear understanding of our ontologies, statistical models, and estimands, we are better positioned to use data to promote a healthier and fairer society.

ABOUT THE AUTHORS

Aaron L. Sarvet is a postdoctoral research fellow in the Department of Mathematics at École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland, working with Mats J. Stensrud. He received his PhD in Population Health Sciences (Epidemiology) from Harvard University. Mats J. Stensrud, MD, Dr Philos, is a tenure-track assistant professor of statistics in the Department of Mathematics, EPFL. His research focuses on methods for causal inference, usually in settings with exposures and outcomes that depend on time. Before coming to EPFL, Dr Stensrud was a Kolokotrones research fellow and Fulbright Research Scholar at the Harvard School of Public Health.
