Abstract

The management of the COVID-19 pandemic presents several unprecedented challenges in different fields, from medicine to biology and from public health to social science, that may benefit from computing methods able to integrate the increasingly available COVID-19 and related data (e.g., pollution, demographics, climate). To address the problems of COVID-19 data collection, harmonization and integration, we present the design and development of COVID-WAREHOUSE, a data warehouse that models, integrates and stores the COVID-19 data made available daily by the Italian Protezione Civile Department together with pollution and climate data made available by the Italian Regions. After an automatic ETL (Extraction, Transformation and Loading) step, COVID-19 cases, pollution measures and climate data are integrated and organized according to the Dimensional Fact Model along two main dimensions: time and geographical location. COVID-WAREHOUSE supports OLAP (On-Line Analytical Processing) analysis, provides a heatmap visualizer, and allows easy extraction of selected data for further analysis. The proposed tool can be used in the context of Public Health to show how the pandemic is spreading with respect to time and geographical location, and to correlate the pandemic with pollution and climate data in a specific region. Moreover, public decision-makers could use the tool to discover combinations of pollution and climate conditions correlated with an increase of the pandemic, and act accordingly. Case studies based on data cubes built on data from the Lombardia and Puglia regions are discussed. Our preliminary findings indicate that the COVID-19 pandemic has spread significantly in regions characterized by a high concentration of particulate matter in the air and by the absence of rain and wind, as also reported in other works in the literature.
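
As a rough illustration of the OLAP roll-up that the two dimensions make possible, the following minimal sketch aggregates a toy fact table from (day, province) up to (month, region). It is not the COVID-WAREHOUSE implementation: the names `FACTS` and `roll_up` and all case counts are hypothetical and invented for the example.

```python
from collections import defaultdict
from datetime import date

# Toy fact rows: (day, region, province, new_cases).
# All values are invented for illustration; they are NOT real COVID-19 data.
FACTS = [
    (date(2020, 3, 1), "Lombardia", "Milano", 10),
    (date(2020, 3, 2), "Lombardia", "Milano", 15),
    (date(2020, 3, 2), "Lombardia", "Bergamo", 20),
    (date(2020, 4, 1), "Lombardia", "Milano", 5),
    (date(2020, 3, 1), "Puglia", "Bari", 2),
    (date(2020, 3, 2), "Puglia", "Bari", 3),
]

# Hierarchy levels of the time dimension (hypothetical naming).
TIME_LEVELS = {
    "day": lambda d: d.isoformat(),
    "month": lambda d: d.strftime("%Y-%m"),
}

# Hierarchy levels of the geographical dimension (hypothetical naming).
GEO_LEVELS = {
    "province": lambda region, province: (region, province),
    "region": lambda region, province: (region,),
}


def roll_up(facts, time_level, geo_level):
    """Sum new_cases at the requested level of each dimension."""
    time_key = TIME_LEVELS[time_level]
    geo_key = GEO_LEVELS[geo_level]
    cube = defaultdict(int)
    for day, region, province, cases in facts:
        cube[(time_key(day),) + geo_key(region, province)] += cases
    return dict(cube)


# Roll up from (day, province) to (month, region).
monthly_by_region = roll_up(FACTS, "month", "region")
# Keep the finest grain for comparison.
daily_by_province = roll_up(FACTS, "day", "province")
```

Choosing coarser or finer keys per dimension is what roll-up and drill-down amount to in this sketch; a real warehouse would delegate the same grouping to its OLAP engine.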

Highlights

  • The COVID-19 (COronaVIrus Disease 2019) outbreak is caused by a novel coronavirus named Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) [1] and was classified as a pandemic disease by the World Health Organization (WHO) on 12 March 2020

  • COVID-19 poses many challenges to several research and application fields, including, to cite a few: the molecular basis of the disease, virus mutations, vaccines and drugs, diagnosis and therapy, ICU (Intensive Care Unit) management [4], healthcare logistics [5], large-scale testing to find infected and already recovered people, large-scale tracing of people's movements and contacts to reduce the spread of the virus, infectious disease modeling [6], epidemiology, public health, the effects of the pandemic at the emotional and behavioural level [7], and the impact of the pandemic on remote working [8]. Each of these challenges may benefit from advanced computing infrastructures and novel software pipelines [9]. Here we focus on the integration of publicly available COVID-19 data, which may simplify data visualization and aggregation, e.g., for decision making and for focusing on specific aspects of the problem, and may simplify the connection of such disease data with environmental and climate data [10,11]

  • All the analyses described would not have been possible without the ability of COVID-WAREHOUSE to integrate Italian COVID-19 data with air pollution and climate data
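
The kind of integration mentioned in the last highlight can be sketched as a join on the shared (date, region) key followed by a simple correlation. This is only an assumption-laden toy: the dictionaries `covid_cases` and `pm10`, the helpers `integrate` and `pearson`, and every number below are hypothetical and do not come from the paper or from any real dataset.

```python
from math import sqrt

# Hypothetical daily records keyed by (ISO date, region).
# All numbers are invented for illustration; they are NOT real measurements.
covid_cases = {
    ("2020-03-01", "Lombardia"): 100,
    ("2020-03-02", "Lombardia"): 150,
    ("2020-03-03", "Lombardia"): 200,
}
pm10 = {
    ("2020-03-01", "Lombardia"): 30.0,
    ("2020-03-02", "Lombardia"): 45.0,
    ("2020-03-03", "Lombardia"): 50.0,
    ("2020-03-04", "Lombardia"): 35.0,  # no matching COVID row: dropped by the join
}


def integrate(cases, measure):
    """Inner join of two sources on their common (date, region) keys."""
    keys = sorted(cases.keys() & measure.keys())
    return [(key, cases[key], measure[key]) for key in keys]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


rows = integrate(covid_cases, pm10)
r = pearson([cases for _, cases, _ in rows], [value for _, _, value in rows])
```

An inner join is used here so that only days present in both sources contribute; a correlation computed this way says nothing about causation and, on three invented points, nothing about the real data either.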


Summary

Introduction

The COVID-19 (COronaVIrus Disease 2019) outbreak is caused by a novel coronavirus named Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) [1] and was classified as a pandemic disease by the World Health Organization (WHO) on 12 March 2020. We first report some recent initiatives aiming to build data warehouses of COVID-19 data, along with some open-source initiatives to build data warehouse systems. BIRT (http://www.eclipse.org/birt/) is an open-source software project that provides the technologies and platform to create data visualizations and reports that can be embedded in rich client and web applications. BIRT is a top-level software project within the Eclipse Foundation, an independent not-for-profit consortium of software industry vendors and an open-source community. BIRT comprises two main components: (i) a visual report designer for creating BIRT Designs, and (ii) a runtime component for generating designs that can be deployed on any Java environment.

