Abstract

Many datasets describing contacts in a population suffer from incompleteness due to population sampling and underreporting of contacts. Data-driven simulations of spreading processes using such incomplete data lead to an underestimation of the epidemic risk, and it is therefore important to devise methods to correct this bias. We focus here on a non-uniform sampling of the contacts between individuals, aimed at mimicking the results of diaries or surveys, and consider as case studies two datasets collected in different contexts. We show that surrogate data built with a method developed for the case of uniform population sampling improve on the direct use of the sampled data, but that this improvement is strongly limited by the underestimation of the link density in the sampled network. We put forward a second method to build surrogate data that assumes knowledge of the density of links within one of the groups forming the population. We show that it gives very good results when the population is strongly structured, and discuss its limitations in the case of a population with a weaker group structure. These limitations highlight the value of measurements using wearable sensors, which can yield accurate information on the structure and durations of contacts.

Highlights

  • An increasing number of studies on epidemic spreading processes use data-driven models

  • It is of interest to understand how the resulting data incompleteness or limited resolution affects the properties of the measured contact network[20,21,22,23], how it affects the outcome of data-driven models using incomplete data[20,23,24,25], and, most importantly, whether it is possible to infer the real network structure or statistical properties from incomplete information[26,27,28] and/or to devise methods to correctly estimate the epidemic risk even from incomplete data[23,29]. (Note that, for some types of wearable sensors, the opposite problem of false positives, i.e., of reported contacts that are not relevant for propagation events, can arise.)

  • Since many datasets are de facto incomplete, it is important to assess how data incompleteness affects the outcome of data-driven simulations, how the resulting biases can be compensated, and how much data is needed for the simulations[23,29,30,31,32]



Introduction

An increasing number of studies on epidemic spreading processes use data-driven models. A similar method has been shown to work well in a case study of contact diaries collected together with data from wearable sensors[29]: although not all contacts were reported in the diaries, building surrogate data using the contact matrix measured in the diaries and publicly available statistics on contact durations made it possible to correctly estimate the outcome of simulations of spreading processes. In these two studies, the density of the sampled data was either equal (for uniform population sampling) or close (for the diaries) to that of the original data.
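To make the role of link density concrete, here is a minimal Python sketch of the quantity a contact matrix of densities records, and of how sampling the links can lower it. It is an illustration only, not the procedure used in the studies cited: the function names `link_density_matrix` and `sample_links`, the toy population, and the threshold rule for keeping a link (only contacts with a long cumulative duration are reported) are all assumptions made for this example.

```python
import random
from collections import Counter
from itertools import combinations_with_replacement


def link_density_matrix(group_of, edges):
    """Density of links for every pair of groups (hypothetical helper).

    group_of: dict mapping each individual to its group label.
    edges: iterable of (u, v) pairs, one per link.
    Returns {(g1, g2): observed links / possible pairs}, with g1 <= g2.
    """
    sizes = Counter(group_of.values())
    links = Counter()
    for u, v in edges:
        links[tuple(sorted((group_of[u], group_of[v])))] += 1

    density = {}
    for g1, g2 in combinations_with_replacement(sorted(sizes), 2):
        if g1 == g2:
            possible = sizes[g1] * (sizes[g1] - 1) // 2
        else:
            possible = sizes[g1] * sizes[g2]
        density[(g1, g2)] = links[(g1, g2)] / possible if possible else 0.0
    return density


def sample_links(weighted_edges, keep_probability, seed=0):
    """Keep each link (u, v, w) with probability keep_probability(w).

    The dependence on the weight w is an assumption of this sketch,
    meant to mimic contacts being reported unevenly in a diary.
    """
    rng = random.Random(seed)
    return [(u, v) for u, v, w in weighted_edges
            if rng.random() < keep_probability(w)]


# Toy population: two groups of two individuals, weights in seconds.
group_of = {"a": "A", "b": "A", "c": "B", "d": "B"}
weighted_edges = [("a", "b", 600), ("a", "c", 20),
                  ("b", "d", 40), ("c", "d", 900)]

full = link_density_matrix(group_of, [(u, v) for u, v, _ in weighted_edges])

# Assumed reporting rule: only contacts of at least 300 s are kept.
reported = sample_links(weighted_edges, lambda w: 1.0 if w >= 300 else 0.0)
sampled = link_density_matrix(group_of, reported)
```

In this toy case the density between groups A and B falls from 0.5 in the full data to 0.0 in the sampled data, while the within-group densities are unchanged. Surrogate data built to match the sampled densities would inherit that deficit.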
