The Authors' Reply: In our study, “Confounding in the Association of Proton Pump Inhibitor Use With Risk of Community-Acquired Pneumonia,” we found that use of PPIs was associated not only with higher rates of community-acquired pneumonia, but also with higher rates of other infectious and non-infectious conditions such as urinary tract infection and deep venous thrombosis. We interpreted these multiple associations as demonstrating selection bias among those using PPIs rather than the causal effect of PPI use on pneumonia that others have argued for extensively. Rysavy and colleagues raise the important objection that we did not report whether PPI use preceded a diagnosis of pneumonia, which matters because temporal precedence is a prerequisite for any causal interpretation of an empirical association. Their point is valid, but several observations are worth noting. First, our estimated odds ratios for the association between PPI use and pneumonia were similar to those reported in previous studies, suggesting this timing issue may be more important in theory than in practice. Second, in an unreported analysis, we restricted cases of pneumonia and other conditions among PPI users to those that occurred after a prescription for a PPI had been filled, which is similar to what Rysavy and colleagues suggest. Doing so did not affect our basic finding that PPI use is associated with multiple disease diagnoses, even within the same individual compared over time during periods of PPI use and non-use. Third, the demonstration in other studies that PPI use is associated with higher temporally subsequent rates of pneumonia is not convincing on its own. Individuals change over time, and trends in disease states or social factors that lead an individual to be prescribed a PPI may also be correlated with future risk of pneumonia.
This association may not reflect a causal impact of PPI use on pneumonia, but rather trends in health or provider behavior that are unobserved by the analyst and that are correlated with both PPI use and subsequent pneumonia risk. Norris eloquently summarizes the intuition behind our falsification approach, and, with some caveats, we agree with the general principle he articulates. The main thrust of Norris’s argument is that falsification endpoints such as urinary tract infection, skin infection, osteoarthritis, chest pain, and so on, cannot be chosen arbitrarily. Indeed, a completely arbitrary, randomly chosen variable (e.g., hair color) would not be an appropriate falsification endpoint, since there is no known causal mechanism by which hair color might affect both a patient's risk of community-acquired pneumonia (CAP) and the probability of being prescribed a PPI. A falsification hypothesis is only useful if it assesses a specific mechanism of confounding; in our case, that mechanism is selection on the basis of unobserved health risks, patient socioeconomic characteristics, or physician attributes, all of which may be associated with both PPI use and risk of CAP. Falsification tests that do not test a specific mechanism of confounding offer little, since they cannot suggest whether selection on unobserved variables is likely to be causing the observed association between a treatment (PPI use) and an outcome of interest (CAP). It is reasonable to question whether osteoarthritis and chest pain would be appropriate falsification outcomes given this discussion.
Although osteoarthritis is a chronic condition and CAP is acute, identifying an association between PPI use and osteoarthritis tests a specific mechanism of confounding: patients who are more likely to see a physician and be diagnosed with a condition such as an osteoarthritis flare are also more likely to be prescribed a PPI (as well as other medications) and diagnosed with CAP (as well as other conditions). Our opinion is that this access to care (or, more generally, access to a diagnosis) may be an important source of confounding beyond the pure physiologic effect of health risks on the risk of CAP. We agree with Norris that chest pain may be endogenous, in that a diagnosis of chest pain may prompt empiric treatment with a PPI. It would therefore not be an ideal falsification test, since the association between chest pain and PPI use may not reflect a specific mechanism of confounding. That said, we found positive falsification tests for a series of additional outcomes, including deep venous thrombosis, urinary tract infection, and cellulitis. We agree with both sets of authors that falsification analyses (or the ‘specificity criterion’) are but one tool to help determine causality in observational studies. We view falsification testing not as a substitute for rigorous observational study designs, but as a necessary complement. More broadly, observational studies in medicine should more frequently seek plausibly exogenous sources of variation in treatment (e.g., natural experiments) to help identify causal relationships. For instance, one may consider using variation in health plan coverage of particular medications as a source of treatment ‘randomization.’