Abstract
Background: Although international migrants make up 15·6% of the English population, there have been no large-scale studies of migrant health using UK primary care electronic health records (EHRs). Developing and validating a migration phenotype (a transparent, reproducible algorithm based on EHRs to identify migrants) is necessary to determine the feasibility of using EHRs for migration health research. This study aims to develop and validate a migrant phenotype in the Clinical Practice Research Datalink (CPRD), the largest UK primary care EHR.
Methods: This is a population-based cohort study of individuals of any age in CPRD between Jan 1, 2007, and Feb 29, 2016, with a diagnostic Read term indicating international migration. We describe the completeness of migration recording as the percentage of individuals recorded as migrants over time. We also describe the representativeness of the cohort (age, sex, and geographical origin) compared with data from the Office for National Statistics (ONS; country of birth and the 2011 English Census).
Findings: 325 391 (3·4%) of 9 448 898 individuals in CPRD had at least one of 440 terms indicating international migration. The cohort was mostly female (53·7% [174 883/325 391] overall; 52·4% [55 734/106 462] in 2011), which is similar to ONS 2011 census data (51·7% [3 791 375/7 337 139]). The percentage of migrants per year increased from 1·2% (69 046/5 716 075) in 2007 to 2·8% (154 525/5 427 745) in 2013, following a similar trend to ONS migration data (11·7% [5 927 000/50 714 000] in 2007; 13·7% [7 285 000/53 164 000] in 2013). Proportions were significantly lower in CPRD than in ONS data (χ² test; p<0·0001). The highest percentages of migrants were in the 25–34-year-old band (4·6% [30 549/668 864] in CPRD; 25·9% [1 851 952/7 160 102] in ONS). Migrants were mostly born in Europe (35·4% [10 316/29 113] in CPRD; 36·5% [2 675 003/7 337 042] in ONS) or the Middle East and Asia (34·5% [10 037/29 113] in CPRD; 34·5% [2 529 137/7 337 042] in ONS).
Interpretation: We created a cohort of international migrants in England that is broadly representative in terms of age, sex, and geographical region of origin. Future validation work should explore representativeness by ethnicity and deprivation. Potential reasons for undersampling compared with ONS data include insufficient recording and poor health-care access. Nonetheless, the large cohort size provides sufficient power for a range of health-care analyses in this potentially underserved population.
Funding: Wellcome Trust (approvals [CPRD ISAC 19_062R]; REC 09/H0810/16).
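
The reported percentages and the direction of the CPRD versus ONS comparison can be checked from the counts given in the Findings. The following is a minimal sketch, not the authors' analysis code: the helper names `pct` and `chi2_two_proportions` are hypothetical, and it assumes a Pearson χ² test on a 2×2 table without continuity correction that treats the CPRD and ONS populations as independent samples. The abstract does not state which comparison or test variant was actually used.

```python
import math


def pct(numerator, denominator):
    """Percentage of the denominator represented by the numerator."""
    return 100.0 * numerator / denominator


def chi2_two_proportions(a_events, a_total, b_events, b_total):
    """Pearson chi-squared test (1 df, no continuity correction) on a 2x2 table.

    Assumption: the two groups are treated as independent samples.
    Returns the test statistic and its p-value.
    """
    table = [
        [a_events, a_total - a_events],
        [b_events, b_total - b_events],
    ]
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_sums)

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n
            stat += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts copied from the Findings paragraph.
print(round(pct(325_391, 9_448_898), 1))   # overall share of CPRD with a migration term
print(round(pct(69_046, 5_716_075), 1))    # CPRD, 2007
print(round(pct(154_525, 5_427_745), 1))   # CPRD, 2013

# Illustrative comparison: CPRD 2007 versus ONS 2007.
stat, p_value = chi2_two_proportions(69_046, 5_716_075, 5_927_000, 50_714_000)
print(stat, p_value)
```

With denominators in the millions, even small differences in proportions yield very small p-values, so the size of the gap (for example 1·2% versus 11·7% in 2007) is more informative than the significance level alone.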