Abstract

In this study, we investigate the key characteristics of more than 1500 COVID-19 datasets published on 35 Open Government Data (OGD) sites around the world. In addition to examining the number of datasets, data formats, star levels (a measure of data formats), and data levels, we investigate the topical focus of dataset content by analyzing the words in dataset titles and descriptions. We found significant differences among world regions in both the number of datasets and the number of data formats. Regional differences were also found across star levels and in the topic contributions to two specific clusters. As one of the first empirical studies to examine the characteristics of COVID-19 datasets on OGD sites, our research helps fill a gap in this area. The findings should benefit governments seeking to improve the public health data provision of their OGD sites by ensuring accuracy, consistency, and timeliness. They should also benefit OGD users: public health professionals, who can assess progress and identify gaps in the community, and everyday citizens, who access OGD datasets to understand the development and risk levels of the crisis. Future research may apply the same method to the general “health” datasets accessible from OGD sites worldwide, to help improve the sharing and reuse of health datasets for the general public.

