Background
Throughout the Covid-19 pandemic, researchers have used electronic health records to study the disease in a rapidly evolving environment of questions and discoveries. These studies are prone to collider bias because they restrict the population of Covid-19 patients to those with severe disease. Inverse probability weighting is typically used to correct for this bias but requires information from the unrestricted population. Using electronic health records from a South London NHS trust, this work demonstrates a method to correct for collider bias using externally sourced data while examining the relationship between minority ethnicities and poor Covid-19 outcomes.

Methods
The probability of inclusion within the observed hospitalised cohort was modelled using estimates from published national data. The model described the relationship between patient ethnicity, hospitalisation, and death due to Covid-19, a relationship suggested to be susceptible to collider bias. The resulting probabilities, applied to the observed patient cohort, were used as inverse probability weights in a survival analysis examining ethnicity (and covariates) as a risk factor for death due to Covid-19.

Results
Within the observed cohort, unweighted survival analysis suggested a reduced risk of death in those of Black ethnicity, in contrast to the published literature. Applying inverse probability weights to this analysis changed this aberrant result to one more compatible with the literature. This effect was consistent whether the analysis was applied to patients within only the first wave of Covid-19 or across two waves. It was also robust to adjustments, made as part of a sensitivity analysis, to the modelled relationship between hospitalisation, patient ethnicity, and death due to Covid-19.

Conclusions
This analysis demonstrates the feasibility of using external publications to correct for collider bias (or other forms of selection bias) induced by restricting a population to a hospitalised cohort, using an example from the recent Covid-19 pandemic.
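
The weighting step described in the Methods can be illustrated with a minimal sketch. It is not the study's implementation: every number below is invented for illustration, the names `SELECTION_PROB`, `ip_weight` and `risk` are hypothetical, and a plain weighted risk ratio stands in for the survival analysis used in the study. The sketch assumes that the probability of hospitalisation given ethnicity and outcome is available from an external source, and shows how weighting each hospitalised patient by the inverse of that probability can remove an apparent protective effect created purely by selection.

```python
# Hypothetical P(hospitalised | ethnicity, died). These values are invented
# for illustration; they are NOT taken from the paper or from national data.
SELECTION_PROB = {
    ("white", 0): 0.10,
    ("white", 1): 0.50,
    ("black", 0): 0.20,
    ("black", 1): 0.50,
}


def ip_weight(ethnicity, died):
    """Inverse probability of inclusion in the hospitalised cohort."""
    return 1.0 / SELECTION_PROB[(ethnicity, died)]


def risk(cohort, ethnicity, weighted=True):
    """Proportion of deaths in one ethnic group, optionally IP-weighted."""
    deaths = total = 0.0
    for eth, died in cohort:
        if eth != ethnicity:
            continue
        w = ip_weight(eth, died) if weighted else 1.0
        deaths += w * died
        total += w
    return deaths / total


# Toy hospitalised cohort of (ethnicity, died) pairs, also invented.
# It is constructed so that both groups have the same risk of death in the
# unrestricted population, but surviving Black patients are hospitalised
# more often, which dilutes their apparent risk within the cohort.
cohort = (
    [("white", 1)] * 50 + [("white", 0)] * 50
    + [("black", 1)] * 50 + [("black", 0)] * 100
)

unweighted_rr = risk(cohort, "black", weighted=False) / risk(cohort, "white", weighted=False)
weighted_rr = risk(cohort, "black") / risk(cohort, "white")

print(f"unweighted risk ratio: {unweighted_rr:.2f}")
print(f"weighted risk ratio:   {weighted_rr:.2f}")
```

In this toy cohort the unweighted risk ratio falls below 1 even though the two groups were constructed to have equal risk, and the weighted ratio recovers that equality. In practice the same weights would be passed to a weighted survival model rather than a simple ratio.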