Abstract
Many computational models rely on real-world data, and the steps required in moving from data collection, to data preparation, to model calibration and input are becoming increasingly complex. Errors in data can lead to errors in model output that, in extreme cases, might invalidate conclusions. While the challenge of errors in data collection has been analyzed in the literature, here we highlight the importance of data handling in the modeling and simulation process, and how particular data handling errors can lead to errors in model output. We develop a framework for assessing the impact of potential data errors for models of spreading processes on networks, a broad class of models that capture many important real-world phenomena (e.g., epidemics and rumor spread). We focus on the susceptible-infected-removed (SIR) and Threshold models and examine how systematic errors in data handling impact the predicted spread of a virus (or information). Our results demonstrate that data handling errors can have a significant impact on model conclusions, especially in critical regions of a system.
The modern computing revolution has led to data science techniques, and in particular computational modeling, being applied in a wide range of fields including sociology,[1] psychology,[2] chemistry,[3] and physics.[4] Inherent in the modeling process is the principle of abstraction, a process that aims to condense processes and phenomena into their most basic ingredients.
Background and related work
Before describing the related work, we briefly introduce the network terminology used in this paper.
Network terminology
A network is a graph, G, made up of a pair (V, E), where V is a set of vertices (nodes), and E is a set of paired vertices called edges (links).[12] The links in a network can be directed or undirected. When applied to real-world systems, nodes represent entities and edges represent links between those entities. If modeling the spread of a virus transmitted through contact, a network can be constructed with nodes representing persons and edges between them representing contact. We list here some key basic network properties:
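To make the definition of G = (V, E) concrete, the following is a minimal Python sketch, not taken from the paper, that stores a small contact network as an adjacency map and contrasts the undirected and directed readings of the same edge list. The function name build_network, the four person labels, and the contact pairs are all hypothetical and chosen only for illustration. Node degree is computed as one familiar example of a basic network property; it is not claimed to be one of the properties the authors go on to list.

```python
def build_network(vertices, edges, directed=False):
    """Return an adjacency map {vertex: set of neighbours} for G = (V, E).

    Hypothetical helper for illustration only; not the authors' code.
    """
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        # Every edge must join two vertices that belong to V.
        if u not in adjacency or v not in adjacency:
            raise ValueError(f"edge ({u!r}, {v!r}) uses a vertex not in V")
        adjacency[u].add(v)
        if not directed:
            # An undirected link is stored in both directions.
            adjacency[v].add(u)
    return adjacency


# Assumed toy data: nodes are persons, edges are recorded contacts.
people = ["ana", "ben", "cy", "dee"]
contacts = [("ana", "ben"), ("ben", "cy"), ("ana", "cy")]

undirected = build_network(people, contacts)
directed = build_network(people, contacts, directed=True)

# Degree of a node = number of neighbours in the undirected network.
degree = {person: len(neighbours) for person, neighbours in undirected.items()}
```

In this toy network "dee" has no recorded contact and so stays isolated, while the directed version keeps only the orientation in which each pair was listed. A spreading model run on the two versions would see different sets of possible transmission routes, which is one way a choice made during data handling can carry through to model output.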