Abstract

In late 2020, messenger RNA (mRNA) covid-19 vaccines gained emergency authorisation on the back of clinical trials reporting vaccine efficacy of around 95%,1, 2 kicking off mass vaccination campaigns around the world. Within 6 months, observational studies reporting vaccine effectiveness in the “real world” at above 90%, similar to trial results,3-6 became the trusted source of evidence upholding these campaigns. While the contemporary conversation about vaccine effectiveness has turned to waning protection, virus variants, and boosters, there has (with rare exception7) been surprisingly little discussion of the limitations of the methodologies of these early observational studies. The lack of critical discussion is notable, for even highly effective vaccines could only partially explain the drop in rates of covid-19 cases, hospitalisations, and deaths by mid-2021. For example, by March 2021, cases in the UK and United States had dropped roughly fourfold from the January peak, when the “fully vaccinated” population had reached only 20% and 5%, respectively. At the same time, in Israel, cases took longer to drop despite a substantially faster vaccine rollout (Figure 1). The vaccination campaigns in these countries can thus only be part of the story.
We are aware of only one article that addresses methodological concerns in non-randomised studies of covid-19 vaccines.7 The author draws attention to potential biases and measurement issues, such as vaccination status misclassification, exposure differences, testing differences, attribution issues, and disease risk factor confounding. Many of these concerns are hard to confirm within specific studies because the relevant data are unavailable (e.g., testing differences), and some cannot be fixed analytically (e.g., exposure and other unmeasured quantities).
In this article, we focus on three major sources of bias for which there is sufficient data to verify their existence, and we show how they could substantially affect vaccine effectiveness estimates from observational study designs, particularly retrospective studies of large population samples that use administrative data to link vaccinations and cases to demographics and medical history. Using information on how cases were counted in observational studies, together with published datasets on the dynamics and demographic breakdown of vaccine administration and background infections, we illustrate how three factors generate residual biases in observational studies large enough to render a hypothetical inefficacious vaccine (i.e., of 0% efficacy) as 50%–70% effective. To be clear, our findings should not be taken to imply that mRNA covid-19 vaccines have zero efficacy. Rather, we use the 0% case to avoid making arbitrary judgements about true vaccine efficacy across various levels of granularity (different subgroups, different time periods, etc.), which is unavoidable when analysing any non-zero level of efficacy. It is also important to note that under hypothetical conditions different from the actual events of early 2021, two of these sources could bias results in the opposite direction, that is, towards underestimating actual vaccine effectiveness. Finally, to draw more precise conclusions about the impact of these biases on specific published studies, we urge that all code and data available to those studies be made public.
In each of our three illustrations, we compare results based on observational study methods against randomised controlled trial (RCT) methods.
For each comparison, one side represents a published study while the other is a counterfactual. In each case, we show how the gap between observational and RCT results is due to a source of bias.
The pivotal covid-19 vaccine trials used a primary endpoint of lab-confirmed, symptomatic covid-19.8-11 Not all covid-19 cases, however, factored into the estimate of vaccine efficacy. Investigators did not begin counting cases until participants were at least 14 days (7 days for Pfizer) past completion of the dosing regimen, a timepoint public health officials subsequently termed “fully vaccinated.”12 The rationale for excluding cases occurring before the start of this “case-counting window” was not provided in trial protocols (the legitimacy of excluding post-randomisation events has long been debated13); however, one Pfizer post-marketing document states that in the early period post-vaccination, “the vaccine has not had sufficient time to stimulate the immune system.”14
In randomised trials, applying the “fully vaccinated” case-counting window to both vaccine and placebo arms is straightforward. But in cohort studies, the case-counting window is applied only to the vaccinated group. Because unvaccinated people do not take placebo shots, there is no second shot from which to count 14 days, so the window cannot be applied to them. This asymmetry, in which the case-counting window nullifies cases in the vaccinated group but not in the unvaccinated group, biases estimates. As a result, a completely ineffective vaccine can appear substantially effective: 48% effective in the example shown in Table 1. (The placebo data in Table 1 come from the Pfizer Phase III randomised trial and serve as the assumed case counts for the unvaccinated group in a counterfactual observational study occurring simultaneously; this setup illustrates both the potential size of case-counting window bias in a real-world setting and why this bias does not exist in a randomised trial.)
We are aware of just one observational study3 that addressed case-counting window bias, by using matching and designating a pseudo-study enrolment date for the unvaccinated member of each matched pair of vaccinated and unvaccinated persons. While matching mitigates case-counting window bias, this method injects an artificial and severe age bias between unvaccinated and vaccinated groups: the matched subset under-represented patients aged ≥70 years by 50% while over-representing patients aged ≤40 years by 50%. (This occurred because the propensity to receive the vaccine is highly influenced by age. The number of one-to-one matched pairs of elderly patients is therefore bounded above by the number of unvaccinated elderly, while the number of matched pairs of younger patients is bounded above by the number of vaccinated young.)
In retrospective studies using large population samples, we propose a simple adjustment that can correct for case-counting window bias. The case rate from vaccination to the start of the case-counting window can be observed in the vaccinated group and applied to the unvaccinated group to estimate the number of cases to exclude before computing the relative ratio of cases. This adjustment preserves the case-counting window while assuming the vaccine is completely ineffective before its start. Because we use the 0% efficacy assumption, this simple adjustment returns the vaccine effectiveness estimate back to zero. A similar strategy has proved useful in influenza treatment analyses.16
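To make the mechanics of this bias and the proposed adjustment concrete, here is a minimal sketch in Python. All quantities (group size, follow-up length, weekly case rate, window start) are hypothetical choices for illustration, not the values behind Table 1; a constant background rate is used deliberately, to isolate the window effect from the background infection rate bias discussed later.

```python
# A minimal sketch of case-counting window bias for a vaccine with 0% true
# efficacy. All numbers are hypothetical and are not the values in Table 1.

N = 100_000        # people per group (hypothetical)
weeks = 10         # weeks of follow-up from first dose (hypothetical)
window_start = 5   # week the "fully vaccinated" counting window opens
                   # (e.g., ~3 weeks between doses + 2 weeks after dose 2)

# Constant weekly case rate, identical in both groups because the vaccine is
# assumed to do nothing; a constant rate isolates the window effect.
rate_per_1000 = 3.0
weekly_cases = [rate_per_1000 / 1000 * N] * weeks   # expected cases per week

# Attack-rate comparison with the window applied only to the vaccinated group:
vax_cases = sum(weekly_cases[window_start:])   # early vaccinated cases nullified
unvax_cases = sum(weekly_cases)                # all unvaccinated cases counted
print(f"apparent VE: {1 - vax_cases / unvax_cases:.0%}")       # 50%, not 0%

# Proposed adjustment: observe the pre-window case rate in the vaccinated
# group, then excise the same expected number of cases from the unvaccinated
# group before taking the relative ratio of cases.
pre_window_cases = sum(weekly_cases[:window_start])  # observed among vaccinated
print(f"adjusted VE: {1 - vax_cases / (unvax_cases - pre_window_cases):.0%}")  # 0%
```

With a constant background rate, the apparent effectiveness simply equals the share of follow-up excluded by the window; under the declining infection rates of early 2021, the excluded early weeks carry disproportionately many cases and the bias grows, which is consistent with the 48% figure in Table 1.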
Age is perhaps the most influential risk factor in medicine, affecting nearly every health outcome. Great care must therefore be taken in studies comparing vaccinated and unvaccinated people to ensure that the groups are balanced by age. Failure to do so may lead to inaccurate estimates of vaccine effectiveness when the difference in outcomes can be explained, at least partially, by age bias. In trials, randomisation helps ensure statistically identical age distributions in vaccinated and unvaccinated groups, so that the average vaccine efficacy estimate is unbiased even if vaccine efficacy and/or infection rates differ across age groups (see Figure 2A). In real life, however, vaccination status is not randomly assigned (see Figure 2B). While vaccination rates are high in many countries, the vaccinated remain, on average, older and less healthy than the unvaccinated because vaccines were prioritised for those older and at higher risk. Individuals also self-select for vaccination regardless of policy. Because covid-19 related risks (of infection, disease, and complications) also vary by age, this can confound the estimate of vaccine effectiveness.
To illustrate this, consider the REACT-1 study.18 This study conducts PCR testing for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) on a random sample of England's population once a month. In June–July 2021 (the most recent data available), SARS-CoV-2 positivity rates varied considerably by age (from 1.7 to 15.6 positives per 1000 individuals), with higher rates among people under 25 years of age (see Figure 2C). REACT-1 also reports vaccination status. As seen in Figure 2B, almost half of the unvaccinated group was aged between 5 and 12, while the most common age group among the vaccinated was 45–54 years. While details differ, age bias is present in all observational data sets.
To understand the impact of age bias, consider a hypothetical vaccine with zero efficacy. The vaccinated and unvaccinated groups’ case rates should be statistically identical if the vaccine were completely ineffective (Figure 2D). But age bias in observational data alters the age-weighted case rates in both the vaccinated and the unvaccinated groups, resulting in different infection rates by vaccination status. Since older people recorded lower infection rates, the age-weighted case rate of the (older) vaccinated group registered at 5.5 per 1000 while the corresponding value for the (younger) unvaccinated group was 11.2 per 1000 (Figure 2C). The resultant vaccine effectiveness, computed from the relative ratio of these case rates, reflects the interaction between the differential age distributions and the correlation of covid-19 incidence with age: the vaccine appears 51% effective even though it is completely ineffective by assumption. (Note that the direction of the age bias would reverse if older age groups had suffered higher case rates during the study period.) This is an instance of Simpson's paradox,19 the condition in which aggregated and disaggregated analyses of the same data lead to contradictory findings, a common phenomenon in real-world data. A viable adjustment method should shift the 51% estimate back to zero.
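The arithmetic behind this paradox can be reproduced in a few lines. In the sketch below, the age bands, age-specific rates, and age mixes are our own hypothetical numbers, loosely patterned on the REACT-1 ranges quoted above rather than taken from that study; direct age standardisation is shown as one standard corrective, not as the method of any particular published study.

```python
# A minimal sketch of age bias producing Simpson's paradox for a vaccine with
# 0% true efficacy. Age bands, rates, and age mixes are hypothetical, loosely
# patterned on the REACT-1 ranges quoted above (1.7-15.6 positives per 1000).

age_groups = ["5-24", "25-44", "45-64", "65+"]

# Age-specific case rates per 1000, higher among the young. The null vaccine
# leaves these unchanged, so both groups share the same age-specific rates.
rates = {"5-24": 15.0, "25-44": 6.0, "45-64": 3.0, "65+": 2.0}
vax_rates, unvax_rates = dict(rates), dict(rates)

# Hypothetical age mix of each group: the vaccinated skew older.
vax_mix = {"5-24": 0.10, "25-44": 0.30, "45-64": 0.35, "65+": 0.25}
unvax_mix = {"5-24": 0.55, "25-44": 0.25, "45-64": 0.15, "65+": 0.05}

def weighted_rate(mix, group_rates):
    """Age-weighted case rate per 1000 for a group with the given age mix."""
    return sum(mix[g] * group_rates[g] for g in age_groups)

crude_vax = weighted_rate(vax_mix, vax_rates)        # ~4.9 per 1000
crude_unvax = weighted_rate(unvax_mix, unvax_rates)  # ~10.3 per 1000
print(f"apparent VE: {1 - crude_vax / crude_unvax:.0%}")   # ~53%, not 0%

# Direct age standardisation: weight both groups' age-specific rates by one
# common reference age distribution before comparing.
ref_mix = {g: (vax_mix[g] + unvax_mix[g]) / 2 for g in age_groups}
std_ve = 1 - weighted_rate(ref_mix, vax_rates) / weighted_rate(ref_mix, unvax_rates)
print(f"age-standardised VE: {std_ve:.0%}")                # back to 0%
```

Because every age group has a rate ratio of exactly 1, any comparison made within (or standardised across) age groups recovers the true 0%; only the crude comparison of unlike age mixes produces the illusion.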
Many observational studies incorporate an age term into regression models in an attempt to correct this age bias.4, 20, 21 But a meta-analysis of influenza vaccine studies found that standard regression adjustments insufficiently correct for the variety and magnitude of such biases.22
From December 2020, the rapid rollout of vaccines, particularly in wealthier nations (Figure 1), coincided with a period of plunging infection rates. However, accurately determining the contribution of vaccines to this decline is far from straightforward. Indeed, the considerable variation in case decline by country, such as the time lag observed in Israel (by far the quickest to reach 50% vaccinated, relative to the UK and the United States), defies simple explanation (Figure 1, timepoint “B”).
The sharp drop in infections complicates estimating vaccine effectiveness from observational data in a manner similar to age bias. The risk of virus exposure was considerably higher in January than in April. Thus, exposure time was not balanced between unvaccinated and vaccinated individuals: exposure time for the unvaccinated group was heavily weighted towards the early months of 2021, while the inverse pattern was observed in the vaccinated group. This imbalance is inescapable in the real world due to the timing of the vaccination rollout.
In addition, unlike trials, individuals in “real-world” studies do not stay in a single analysis subgroup throughout the study period: each person is unvaccinated from the first day of the study until the day of vaccination (or the end of the study, should the person remain unvaccinated). Instead of crudely categorising individuals as either “vaccinated” or “unvaccinated,” many observational studies therefore split each person's exposure time into an unvaccinated period followed, if the individual got vaccinated, by a vaccinated period.4-6 This technique is essential in contexts where the vast majority of the population becomes vaccinated, to avoid losing a comparison population. However, it injects a strong bias into the analysis subgroups because the unvaccinated exposure time is heavily skewed towards the early part of a study while the exposure time for vaccinated people skews towards the end of the study period.
For a hypothetical vaccine with zero efficacy, the case rates for vaccinated and unvaccinated people should be equal during each week of the study period. Indeed, in RCTs, changes in the background infection rate do not bias estimates of vaccine efficacy because, by design, vaccine and placebo arms follow a synchronised dosing schedule that ensures exposure (at-risk) time is balanced, even in the context of changing infection rates. But background infection rate bias can cause estimates of vaccine effectiveness in “real-world” studies to deviate widely from 0%. For example, using infection rate data from an actual observational study of Danish nursing home residents,20 where infection rates rapidly declined simultaneously with vaccine rollout (from 12 per 1000 residents in December 2020 to almost 0 during the last 2 weeks of the study), the vaccine effectiveness of a hypothetically ineffective vaccine appears as 67%, an illusion created chiefly because unvaccinated exposure time was concentrated in the earlier weeks of higher background infection rates (Figure 3). We note that the direction of this bias would reverse if the background infection rate had steadily risen during the study period (i.e., vaccinating into a wave rather than out of one).
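The same mechanism can be sketched numerically. In the example below, the weekly rates and rollout schedule are hypothetical, loosely patterned on the Danish figures quoted above (the apparent effectiveness depends on the assumed schedule; here it comes out near 72%, in the same range as the 67% above). The vaccine again truly does nothing, and person-time is split by current vaccination status as in the studies described. A calendar-week-stratified (Mantel-Haenszel) rate ratio is included as one standard way to compare like with like in time, not as the method of the Danish study.

```python
# A minimal sketch of background infection rate bias for a vaccine with 0%
# true efficacy. Weekly rates and rollout speed are hypothetical, loosely
# patterned on the Danish figures quoted above (12 per 1000 falling to ~0).

N = 100_000
# Background infection rate per 1000 person-weeks, declining during rollout.
rate = [12, 11, 9, 7, 5, 4, 3, 2, 1, 0.5, 0.2, 0.1]
# Cumulative fraction of the population vaccinated at the start of each week.
vax_frac = [0.0, 0.1, 0.2, 0.35, 0.5, 0.65, 0.75, 0.85, 0.9, 0.95, 0.97, 0.98]
weeks = len(rate)

# Each week, person-time splits by current vaccination status, and cases occur
# at the same rate in both groups: the vaccine truly does nothing.
vax_pt = [N * f for f in vax_frac]            # vaccinated person-weeks
unvax_pt = [N * (1 - f) for f in vax_frac]    # unvaccinated person-weeks
vax_cases = [pt * r / 1000 for pt, r in zip(vax_pt, rate)]
unvax_cases = [pt * r / 1000 for pt, r in zip(unvax_pt, rate)]

# Pooled rate ratio, ignoring calendar time: unvaccinated person-time is
# concentrated in the early high-incidence weeks, vaccinated in the late ones.
pooled_rr = (sum(vax_cases) / sum(vax_pt)) / (sum(unvax_cases) / sum(unvax_pt))
print(f"apparent VE, pooled: {1 - pooled_rr:.0%}")           # ~72%, not 0%

# Mantel-Haenszel rate ratio stratified by calendar week, comparing like with
# like in time; within every week the two rates are identical by construction.
num = sum(vax_cases[w] * unvax_pt[w] / (vax_pt[w] + unvax_pt[w]) for w in range(weeks))
den = sum(unvax_cases[w] * vax_pt[w] / (vax_pt[w] + unvax_pt[w]) for w in range(weeks))
print(f"apparent VE, week-stratified: {1 - num / den:.0%}")  # back to 0%
```

In this idealised setting, stratification removes the bias exactly; in real data, strata can be sparse and group rates within a week need not coincide, so calendar-time adjustments mitigate rather than eliminate the bias.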
The Danish study was one of the first “real-world” studies to recognise this background infection rate bias. The researchers added a “calendar time” adjustment term to their Cox regression model to address it, which reduced their estimate of vaccine effectiveness from 96% to 64%.20 However, as with age bias, we believe that regression adjustment is unlikely to sufficiently correct this type of imbalance. Because the regression equation was not published, we could not make a more definitive assessment.
A recent commentary discussed multiple factors that can bias estimates of covid-19 vaccine effectiveness, such as vaccination status misclassification, testing differences, and disease risk factor confounding.7 Our article complements these observations by providing examples based on actual data sets that quantify how case-counting window bias, age bias, and background infection rate bias can profoundly complicate the analysis of observational studies, shifting covid-19 vaccine effectiveness estimates by an absolute magnitude as high as 50% to 70%.
Randomised trials mitigate these biases by virtue of design features such as randomisation, placebo controls, and blinding. But while randomised trials should offer far superior protection against these biases, the premarketing trials left many important questions unstudied, such as the durability of protection, interaction with other countermeasures, and effectiveness in the highest-risk and other important subpopulations. Pragmatic, placebo-controlled randomised trials might have addressed some of these limitations, but after manufacturers began unblinding their trials following the emergency use authorisations in December 2020, observational studies became all we have. Our analysis shows that real-world conditions such as non-randomised vaccination, crossovers, and trends in background infection rates introduce strong, complex biases into these observational datasets. Our contribution is to size up three important biases, the magnitude of which surprised us and may surprise others. We conclude that “real-world” studies using methodologies popular in early 2021 overstate vaccine effectiveness.
Our finding highlights how difficult it is to conduct high-quality observational studies during a pandemic. While the current situation leaves much to be desired, several steps can be taken going forward to enhance the quality of observational studies. Greater awareness of these biases could promote more appropriate adjustments in future studies, including the use of quasi-experimental methods. In addition, journal editors could improve the transparency and reproducibility of observational studies by requiring disclosure of underlying data and code, as well as publication of modelling equations, tables of coefficients, and standard errors.23
Data availability severely restricted our choice of studies to examine and, even for the studies we selected, prevented us from analysing all three biases simultaneously. As shown in Table 2, we would have needed additional information, such as (a) cases from first dose by vaccination status; (b) age distribution by vaccination status; (c) case rates by vaccination status by age group; (d) match rates between vaccinated and unvaccinated groups on key matching variables; (e) background infection rate by week of study; and (f) case rate by week of study by vaccination status. In future work, we hope to analyse examples using hospitalisations or deaths as endpoints, which is possible only with broader data disclosure.
The pandemic offers a magnificent opportunity to recalibrate our expectations about both observational and randomised studies. “Real-world” studies today are still published as one-off, point-in-time analyses. But much more value would come from having results posted to a website with live updates as epidemiological and vaccination data accrue. Continuous reporting would allow researchers to demonstrate that their analytical methods not only explain what happened during the study period but also generalise beyond it.
Finally, randomised studies should not be considered irrelevant in the post-authorisation phase. An element of randomisation can be incorporated into real-world vaccine distribution. Where populations are still largely unvaccinated and resources do not allow vaccinating everybody at once, designs such as the stepped-wedge cluster randomised rollout24, 25 should be given serious consideration for their ability to derive important scientific information ethically. Any tool that eliminates some amount of real-world bias would reduce the complexity of analysing observational data.
Kaiser Fung and Peter Doshi came up with the idea for the paper; Kaiser Fung carried out the statistical analyses and wrote the first draft. All authors were involved in discussing the content and presentation and in editing the manuscript.
We have the following interests to declare: Peter Doshi has received travel funds from the European Respiratory Society (2012) and Uppsala Monitoring Center (2018); grants from the FDA (through University of Maryland M-CERSI; 2020), Laura and John Arnold Foundation (2017-22), American Association of Colleges of Pharmacy (2015), Patient-Centered Outcomes Research Institute (2014-16), Cochrane Methods Innovations Fund (2016-18), and UK National Institute for Health Research (2011-14); was an unpaid IMEDS steering committee member at the Reagan-Udall Foundation for the FDA (2016-2020); and is an editor at The BMJ. KF, MJ: None.
Data sharing is not applicable to this article as no new data were created or analysed in this study.
