Abstract

The paper by Nguyen et al.1 published in this issue of Epidemiology compares the recently suggested inverse odds ratio approach for addressing mediation with a more conventional Baron and Kenny-inspired method. Interestingly, the comparison is not made through a discussion of the restrictiveness of the implied assumptions, asymptotic properties, or simulations; instead, Nguyen et al.1 compare the results obtained by applying the two methods to a real-life mediation problem, which is scientifically interesting in its own right. We applaud this choice, as we believe it simultaneously ensures that the comparison is based on properties that matter in actual applications and makes the comparison accessible to a broader audience. In a wider context, the choice to stay close to real-life problems mirrors a general trend within the literature on mediation analysis, namely, to put more and more emphasis on ease of implementation, usability, and explanation; see, for instance, the SAS and SPSS macros by VanderWeele and Valeri2 and the natural effects models implemented in the accompanying R package medflex by Vansteelandt and colleagues.3–5 Nguyen et al.1 also include R code in their publication, thereby shortening the road from reading their paper to employing the considered methods on one’s own data. In this commentary, we follow up on these developments by providing a snapshot of how applied mediation analysis was actually conducted in 2015. While we do not expect to find applications using the inverse odds ratio approach, as it simply has not had enough time to move from theoretical concept to published applied paper, we do expect to be able to judge the willingness of authors and journals to employ the causal inference-based approach to mediation analyses. Our hope is that the snapshot will serve to illuminate whether further studies like Nguyen et al.1 are needed or whether blind spots have appeared in the methodological community.
As we could not survey all journals within epidemiology, we instead chose to focus on the top five journals according to the 2014 Journal Citation Reports by Thomson Reuters. Accordingly, we surveyed the following journals: International Journal of Epidemiology, Epidemiologic Reviews, Epidemiology, European Journal of Epidemiology, and American Journal of Epidemiology. In addition, we wanted to briefly examine how applied mediation analyses were conducted in more clinical journals. We therefore also included the Lancet family of journals and the New England Journal of Medicine. These journals were chosen because of their high impact and prestige. The commentary is structured as follows: First, we discuss the insights we had hoped the applied communities would have gained from causal inference-based mediation analysis. Second, we present the results of the review as well as the methodology employed. Finally, we provide our hopes for the future.

WHAT HAS CAUSAL INFERENCE ADDED TO MEDIATION ANALYSIS?

Clearly, the underlying thinking in any mediation analysis (i.e., which pathways bring about an observed cause–effect relation) vastly predates modern causal inference-based mediation analysis as well as the seminal 1986 paper by Baron and Kenny.6 To name just one example, the search for the mechanisms linking exposure to contaminations with subsequent disease, which was already ongoing in the 16th century and culminated in Louis Pasteur’s identification of bacteria as the “mediating factor,” was based on mediation thinking. Such intuitive concepts of mediation analysis led to the first generation of statistical mediation analysis, building on regressions and path analysis in general. This part of the literature is often associated with the Baron and Kenny paper. Only as the last step was formal causal inference-based thinking (using concepts such as nested counterfactuals) applied to mediation analyses.
Acknowledging this long history, what then are the novel contributions from the application of formal causal inference thinking to mediation analysis? As we see it, the contributions of causal inference to mediation analysis can be divided into two categories: (1) On the conceptual level, we have been able to move from a somewhat imprecise (perhaps intuition-based) definition of what a mediated effect is to precise and mathematically well-founded definitions such as natural direct and indirect effects. Recall, for instance, that the natural indirect effect is the change we would observe in a given outcome if we could change each person’s mediator to the value it would naturally take when the exposure was intervened on, but without actually changing the exposure from its reference level; see the recent book by VanderWeele7 for an in-depth discussion. Causal inference thinking has also allowed us to state precisely which assumptions must be satisfied to allow the estimation of mediated effects. (2) On a more practical level, the causal inference-based approach to mediation analysis has led to the development of statistical tools that can perform mediation analysis on most data types and in most settings, even when complicating interactions are present. This development has greatly expanded the mediation toolbox from the original “difference of coefficients” and “product of coefficients” methods.8 The following examples illustrate the breadth of the proposed methods: how to conduct mediation analysis on survival outcomes,9 multiply robust approaches,10 and generic methods allowing almost all data types and interactions.3,4,11 Some of the proposed methods contain only the mathematical details, others also include code snippets, and a few have full accompanying software implementations.
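The verbal definition of the natural indirect effect given under point (1) can be written with nested counterfactuals. The following is a sketch in notation that is assumed here rather than taken from the commentary: a^* denotes the reference exposure level, a the comparison level, M(a) the mediator value under exposure a, and Y(a, m) the outcome under exposure a and mediator value m.

```latex
% Assumed notation (not from the commentary):
%   a^*      reference exposure level
%   a        comparison exposure level
%   M(a)     mediator value under exposure a
%   Y(a, m)  outcome under exposure a and mediator value m
\begin{align*}
\text{NIE} &= E\bigl[Y(a^*, M(a))\bigr] - E\bigl[Y(a^*, M(a^*))\bigr], \\
\text{NDE} &= E\bigl[Y(a, M(a))\bigr] - E\bigl[Y(a^*, M(a))\bigr], \\
\text{TE}  &= E\bigl[Y(a, M(a))\bigr] - E\bigl[Y(a^*, M(a^*))\bigr]
            = \text{NDE} + \text{NIE}.
\end{align*}
```

Here the indirect effect holds the exposure at its reference level, matching the wording above, and the direct effect is the complementary term that makes the two sum to the total effect. Other presentations hold the exposure at the comparison level in the indirect effect instead and pair it with a direct effect in which the mediator is held at M(a^*).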
The single example considered in Nguyen et al.1 shows that the advanced estimation tool, in this case the inverse odds ratio method, does not necessarily lead to different estimates or smaller standard errors compared to the traditional Baron and Kenny method. It would be interesting to see whether this observation is a general feature, which could be investigated either by a simulation study or by conducting more comparisons on real scientific problems and data sets. The discussion section of Nguyen et al.1 demonstrates that a theoretical understanding of both the assumptions and the resulting estimates is necessary for a correct interpretation of the mediation analysis. In short: in the example in Nguyen et al.,1 part 2 above was not needed, but part 1 was. In the following, we broaden this analysis to a snapshot of published applied mediation analyses in 2015.

A SNAPSHOT OF PUBLISHED MEDIATION ANALYSES

Methodology

Using Google Scholar, we identified all papers published in 2015 that contained the keywords “mediation,” “mediated,” or “effect decomposition” and appeared in International Journal of Epidemiology (IJE, 59 papers identified), Epidemiologic Reviews (0 papers), Epidemiology (Epi, 32 papers), European Journal of Epidemiology (EJE, 24 papers), American Journal of Epidemiology (AJE, 47 papers), Lancet family of journals (146 papers), and New England Journal of Medicine (NEJM, 36 papers). We conducted the search on 30 March 2016. As the keywords are by no means restricted to statistical mediation analysis, an initial screening removed all papers not containing an applied mediation analysis. This entailed, for instance, the exclusion of papers on mediation between victim and aggressor. We also removed purely methodologic papers and commentaries/letters. After the initial screening, 20 papers remained, each presenting original applied mediation analyses.
The goal of the snapshot analysis was (a) to present an overview of the applied mediation analyses being published within 1 year and (b) to assess the degree to which insight from formal causal inference-based mediation analysis is being employed. Point (a) is mainly achieved through a brief presentation of each paper (see Table). In point (b), we wish to assess the degree to which parts 1 and 2 of the preceding section are used in the surveyed papers. This is operationalized by assessing three criteria: first, whether the mediation analysis includes a discussion of confounding of the mediator–outcome relationship (column C in Table); second, whether the paper includes a discussion of required identifiability conditions (e.g., nonintertwined causal pathways for natural direct and indirect effects) for mediation analyses (column A in Table); third, whether the paper uses a formal interpretation of the indirect effects (e.g., the interpretation of the natural indirect effect provided in our point 1 above) or a more intuitive interpretation focusing on the relative importance of the mediator (column I in Table). The following quote24 illustrates what we mean by an intuitive interpretation: “The results of our mediation analysis suggested that IQ was responsible for 72% of the effect on income.” It must be stressed that the review is by no means an assessment of the scientific value of the mentioned papers; it is only a review of the methodology employed.

TABLE: Overview of Papers Presenting Novel Statistical Mediation Analyses in 2015

RESULTS

From the snapshot analysis summarized in the Table, it can be concluded that estimation of direct and indirect effects is often done using simpler methods such as difference of coefficients or product of coefficients. However, causal inference-based estimation techniques and structural equations are also widely used.
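To make the two simpler estimators concrete, the following is a minimal sketch on simulated data; none of the surveyed papers' data or models are reproduced. It assumes linear models with a binary exposure, a continuous mediator, no exposure-mediator interaction, and no confounding; all function names, variable names, and parameter values are hypothetical. It also computes a "proportion mediated", the kind of quantity behind intuitive statements such as the quoted 72%.

```python
import random


def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)


def simple_slope(x, y):
    """OLS slope of y regressed on x (intercept included)."""
    return cov(x, y) / cov(x, x)


def two_predictor_slopes(x1, x2, y):
    """OLS slopes of y regressed on x1 and x2 (intercept included)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate(n=2000, seed=1):
    """Simulate exposure -> mediator -> outcome with made-up coefficients."""
    rng = random.Random(seed)
    exposure = [float(rng.random() < 0.5) for _ in range(n)]
    mediator = [0.8 * a + rng.gauss(0, 1) for a in exposure]
    outcome = [0.5 * a + 0.6 * m + rng.gauss(0, 1)
               for a, m in zip(exposure, mediator)]
    return exposure, mediator, outcome


exposure, mediator, outcome = simulate()

# Total effect: outcome regressed on exposure only.
total = simple_slope(exposure, outcome)
# Exposure -> mediator path: mediator regressed on exposure.
a_path = simple_slope(exposure, mediator)
# Direct effect and mediator -> outcome path: outcome on exposure and mediator.
direct, b_path = two_predictor_slopes(exposure, mediator, outcome)

indirect_difference = total - direct   # "difference of coefficients"
indirect_product = a_path * b_path     # "product of coefficients"
proportion_mediated = indirect_product / total

print("difference of coefficients:", indirect_difference)
print("product of coefficients:   ", indirect_product)
print("proportion mediated:       ", proportion_mediated)
```

With ordinary least squares fitted on the same observations and no interaction term, the two estimators coincide algebraically, so any difference in the printed values is floating-point rounding. That equality does not carry over in general to nonlinear models or models with interactions, which is one setting the causal inference-based tools were developed for.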
While we have not statistically assessed the question, we observe a tendency for causal inference-based methods to be more widely used in Epidemiology compared with the other surveyed journals. The formal understanding of mediation analysis is least pronounced among the clinical journals (Lancet family and NEJM). A positive observation is that confounding of the mediator–outcome relationship, which has previously received too little attention in applied mediation analyses, is almost always addressed. From a subject matter perspective, we observe that the identified mediation analyses span a broad range of topics, with no single topic dominating.

DISCUSSION

In this commentary, we wished to assess the degree to which (a) structural understanding and (b) statistical tools developed within the causal inference-based research on mediation analysis have made it into applied mediation analyses. The snapshot review shows that many aspects of the recent decades’ methodologic developments have indeed been adopted in the applied community at large. In particular, the need for addressing confounding of the mediator–outcome relation has been fully taken up. The most prominent conclusion, however, is that mediation analyses are often interpreted in an intuitive fashion. In other words, the formal definition of the measure of mediation (e.g., natural effects) is not important in the interpretations of the surveyed papers. We therefore conclude that the snapshot review underpins the need for and importance of the work in Nguyen et al.1 Such work provides hands-on advice on the choice of methods and is simultaneously a soft introduction to the precise definitions and interpretations of mediation analysis. The snapshot review also poses questions to the methodological community working on mediation analysis.
In particular, one must wonder whether we have failed in communicating the precise interpretations of the different measures within mediation analyses (e.g., natural effects). It gives food for thought that almost none of the papers reviewed in the snapshot were really interested in the exact estimate of, for example, the indirect effect; instead, attention was on classifying a given mediator on a scale from not important to full mediation. Adopting the position that the applied mediation analyses have actually used the scientifically most appropriate techniques and interpretation (i.e., this is not a lack of knowledge, but a deliberate choice) leads to a challenging question for the methodologic community, namely: Do the scientific questions that call for statistical mediation analyses require different parameters than those the community has provided? As statistical mediation analysis becomes more widely employed, it will be interesting to follow how the new perspectives coming from the applied side feed back into methodological development.

ABOUT THE AUTHORS

THEIS LANGE is Associate Professor in biostatistics at the University of Copenhagen and currently visiting professor at the Center for Statistical Science, Peking University. He has contributed to causal inference and epidemiology through his work on mediation analysis with survival data, for which he was awarded the Kenneth Rothman Award, and through the introduction of Natural Effects Models. He has been involved in numerous applied mediation studies spanning from register-based epidemiology to clinical studies. He is currently developing mediation analysis tools to handle data where death can happen during the time period in which the mediator is determined, as is the case in intensive care units. LIIS STARKOPF holds a master’s degree in mathematical statistics from the University of Copenhagen. She is currently a graduate student in biostatistics at the University of Copenhagen.
ACKNOWLEDGMENTS

T. Lange gratefully acknowledges support from the Dynamical Systems Interdisciplinary Network, University of Copenhagen and the Copenhagen Infant Mental Health Project. Both authors gratefully acknowledge support from the Centre for Research in Intensive Care.
