Constructing synthetic populations in the age of big data

Mioara A Nicolaie, Koen Füssenich, Caroline Ameling, Hendriek C Boshuizen

doi:10.1186/s12963-023-00319-5

Abstract

Background: To develop public health intervention models using micro-simulations, extensive personal information about inhabitants is needed, such as socio-demographic, economic and health figures. Such data must remain confidential, yet they should also reflect realistic scenarios. They can be collected only in secured environments and are not directly available for open-source micro-simulation models. The aim of this paper is to illustrate a method for constructing synthetic data by predicting individual features through models based on confidential data on health and socio-economic determinants of the entire Dutch population.

Methods: Administrative records and health registry data were linked to socio-economic characteristics and self-reported lifestyle factors. For the entire Dutch population (n = 16,778,708), all socio-demographic information except lifestyle factors was available. Lifestyle factors were available from the 2012 Dutch Health Monitor (n = 370,835). Regression models were used to sequentially predict individual features.

Results: The synthetic population resembles the original confidential population. Features predicted in the first stages of the sequential procedure are very similar to those in the original population, while those predicted in later stages accumulate limitations arising from data quality and from previously modelled features.

Conclusions: By combining socio-demographic, economic, health and lifestyle-related data at individual level on a large scale, our method provides a powerful tool to construct a synthetic population of good quality and with no confidentiality issues.
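The sequential idea described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the three features (age, income, smoker), the toy "confidential" data, and the helper names fit_linear, fit_logistic and synthesize are all hypothetical, and each feature is conditioned on only one earlier feature to keep the example short. Models are fitted on the original records, and a new population is then drawn feature by feature from the fitted models.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def fit_linear(xs, ys):
    """Closed-form least squares for y = a + b*x; also returns the residual SD."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    sd = math.sqrt(sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    return a, b, sd


def fit_logistic(xs, ys, lr=0.5, epochs=300):
    """Fit P(y=1 | x) = sigmoid(a + b*x) by gradient descent on standardized x."""
    n = len(xs)
    mx = sum(xs) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    zs = [(x - mx) / sx for x in xs]
    a = b = 0.0
    for _ in range(epochs):
        ga = gb = 0.0
        for z, y in zip(zs, ys):
            err = sigmoid(a + b * z) - y
            ga += err
            gb += err * z
        a -= lr * ga / n
        b -= lr * gb / n
    # Convert the coefficients back to the original scale of x.
    return a - b * mx / sx, b / sx


def synthesize(real, n, rng):
    """Draw n synthetic people, one feature at a time (hypothetical 3-stage chain)."""
    ages = [r["age"] for r in real]
    incomes = [r["income"] for r in real]
    smokers = [r["smoker"] for r in real]
    a1, b1, sd1 = fit_linear(ages, incomes)    # model used in stage 2
    a2, b2 = fit_logistic(incomes, smokers)    # model used in stage 3
    people = []
    for _ in range(n):
        age = rng.choice(ages)                  # stage 1: observed marginal of age
        income = rng.gauss(a1 + b1 * age, sd1)  # stage 2: income given age
        p_smoke = sigmoid(a2 + b2 * income)     # stage 3: smoking given income
        smoker = 1 if rng.random() < p_smoke else 0
        people.append({"age": age, "income": income, "smoker": smoker})
    return people


# Made-up stand-in for the confidential records (assumption, not real data).
rng = random.Random(0)
real = []
for _ in range(1000):
    age = rng.randint(20, 80)
    income = rng.gauss(10 + 0.5 * age, 5)
    smoker = 1 if rng.random() < sigmoid(1.0 - 0.05 * income) else 0
    real.append({"age": age, "income": income, "smoker": smoker})

synthetic = synthesize(real, 1000, rng)
```

Because each stage draws from a model fitted on earlier features, any misfit in an early model is passed on to the features generated after it, which is consistent with the Results statement that later-stage features carry accumulated limitations.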

Journal: Population Health Metrics	Publication Date: Oct 31, 2023
Citations: 1	License type: CC BY 4.0

Similar Papers

Indigenous health equity in health register ascertainment and data quality: a narrative review
Karen Wright ... Anna Mackey
12 Mar 2022
International Journal for Equity in Health | VOL. 21

The double burden of malnutrition in indigenous and nonindigenous Guatemalan populations
Manuel Ramirez-Zea ... Rebecca Kanter
01 Dec 2014
The American Journal of Clinical Nutrition | VOL. 100

Lifestyle Factors and Risk for Symptomatic Gastroesophageal Reflux in Monozygotic Twins
Zongli Zheng ... Weimin Ye
17 Nov 2006
Gastroenterology | VOL. 132

Family Physicians' Experiences With Community Mental Health Centers: A Multilevel Analysis
Oyvind Andresen Bjertnaes
01 Aug 2008
Psychiatric Services | VOL. 59
