Abstract

To understand health outcomes for distinct sub-groups of the population or across different geographies, it is advantageous to be able to build bespoke groupings from individual-level data. Individuals possess distinct characteristics, exhibit distinct behaviours and accumulate their own unique history of exposures and experiences. However, in most disciplines, not least public health, little individual-level data is available outside of secure settings, especially data covering large portions of the population. This paper details the creation of a synthetic micro dataset of individuals in Great Britain, with detailed attributes that can be used to model a wide range of health and other outcomes. These attributes are constructed from a range of sources, including the United Kingdom Census, survey and administrative datasets. The paper provides a rationale for the need for this synthetic population, discusses methods for creating the dataset and presents example results of attribute distributions for distinct sub-population groups and across different geographical areas.

Highlights

  • One of the central issues that researchers and policy makers face when modelling outcomes in a public health context is access to spatially representative individual-level data

  • We present the rationale for, and the microsimulation methods used to construct, a synthetic population used by the SIPHER (Systems Science in Public Health and Health Economics Research) consortium, a collaboration of researchers from seven universities, three government partners and 12 practice partners

  • In this paper we present details of the construction and validation of the synthetic population for Great Britain (GB), and show the population synthesis results for several example geographical areas: the city region of Greater Manchester, Sheffield local authority district, Glasgow council area and Cardiff local authority district

Background & Summary

One of the central issues that researchers and policy makers face when modelling outcomes in a public health context is access to spatially representative individual-level data. Microsimulation uses attribute-rich individual-level sample data to estimate the characteristics of a larger population [1,2]. An extension of this approach that explicitly accounts for spatial distributions is often termed spatial microsimulation [3]. Spatial microsimulation adds geographical constraints and allows for the synthesis of individuals within defined geographical zones [8]. This combines the advantages of non-spatial attribute-rich microdata with geographically aggregated data to synthesise a population of individuals containing characteristics from both sources. It has been widely applied in fields such as population projections (e.g. [9,10]), health studies (e.g. [11,12]), transport analysis (e.g. [13,14]), policy evaluation (e.g. [15,16]) and assessment of deprivation and inequality (e.g. [17,18]). Our framework can be used to create additional microdata using other survey or administrative datasets which contain individual-level information.
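To make the idea of combining survey microdata with zone-level constraints concrete, the following is a minimal sketch of iterative proportional fitting (IPF), one common reweighting technique in spatial microsimulation. It is an illustration only and is not necessarily the method used to build the SIPHER synthetic population. The toy sample, the constraint totals and the names `sample`, `zone_constraints`, `ipf_weights` and `weighted_totals` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy survey sample: each respondent carries the
# attributes that are also available as aggregate zone counts.
sample = [
    {"id": 1, "age": "16-44", "sex": "F"},
    {"id": 2, "age": "16-44", "sex": "M"},
    {"id": 3, "age": "45+", "sex": "F"},
    {"id": 4, "age": "45+", "sex": "M"},
]

# Made-up aggregate counts for a single illustrative zone
# (in practice these would come from census-style tables).
zone_constraints = {
    "age": {"16-44": 60.0, "45+": 40.0},
    "sex": {"F": 55.0, "M": 45.0},
}


def weighted_totals(sample, weights, variable):
    """Sum the weights of respondents in each category of one variable."""
    totals = defaultdict(float)
    for person, w in zip(sample, weights):
        totals[person[variable]] += w
    return dict(totals)


def ipf_weights(sample, constraints, iterations=50, tol=1e-9):
    """Reweight respondents so weighted totals match the zone constraints."""
    weights = [1.0] * len(sample)
    for _ in range(iterations):
        max_change = 0.0
        for variable, targets in constraints.items():
            totals = weighted_totals(sample, weights, variable)
            for i, person in enumerate(sample):
                current = totals[person[variable]]
                if current > 0:
                    # Scale by the ratio of target count to current weighted count.
                    new_w = weights[i] * targets[person[variable]] / current
                    max_change = max(max_change, abs(new_w - weights[i]))
                    weights[i] = new_w
        # Stop once a full pass over all constraints no longer changes weights.
        if max_change < tol:
            break
    return weights


weights = ipf_weights(sample, zone_constraints)
age_totals = weighted_totals(sample, weights, "age")
sex_totals = weighted_totals(sample, weights, "sex")
```

Each weight can be read as the number of people in the zone that a given respondent represents. Producing a synthetic population of whole individuals would need a further step that this sketch leaves out, such as converting the fractional weights to integer counts.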
