Abstract

Social media and other forms of volunteered geographic information (VGI) are frequently used as a source of fine-grained big data for research. While employing geographically referenced social media data for a wide array of purposes has become commonplace, the spatial scales over which these data are relevant are typically unknown. For researchers to use VGI appropriately (e.g., aggregated to areal units such as neighbourhoods to elicit key trend or demographic information), general methods for assessing data quality are required, in particular methods that explicitly link data quality to relevant spatial scales, because there are no accepted standards or sampling controls. We present a data quality metric, the Spatial-comprehensiveness Index (S-COM), which can delineate feasible study areas or spatial extents based on the quality of uneven and dynamic geographically referenced VGI. This scale-sensitive approach to analyzing VGI is demonstrated over different grains with data from two citizen science initiatives. The S-COM index can be used both to assess feasible study extents based on coverage, user-heterogeneity, and density and to find feasible sub-study areas within a larger, indefinite area. The results identified sub-study areas of VGI for focused analysis, supporting wider adoption of similar methodology in multi-scale analyses of VGI.

Highlights

  • Volunteered geographic information (VGI) [1], produced in full or in part by citizens, has been shown to have scientific, social, and cultural value.

  • The irregular nature of informal data-authoring processes creates sampling patterns that often do not correspond to well-defined natural or administrative study areas for analysis. We explore this issue through the lens of a new data quality property termed “spatial data comprehensiveness”, which we define as a “suitably even distribution of data observations in terms of coverage and density to fit the analytical needs within a geographic area”.

  • We present a new index designed to reduce the uncertainties that arise from the inability to control sampling. It identifies feasible study areas within VGI datasets by evaluating a composite index of spatial data comprehensiveness (S-COM), without reference to external data.

Introduction

Volunteered geographic information (VGI) [1], produced in full or in part by citizens, has been shown to have scientific, social, and cultural value. When little is known about the data-authoring process giving rise to a VGI dataset, evaluating the evenness of contributions is critical for identifying the relevant spatial scale for study. This is an especially important consideration when the geographic phenomena being investigated are themselves fuzzy and uncertain, such as what individuals consider to be a city’s “downtown”. Data from two citizen science projects, RinkWatch and FrogWatch, are used to illustrate different contexts in which finding feasible extents and aggregation units is a required first step in analyzing VGI point datasets. These two projects employ web-based maps and interfaces to solicit citizen reports of individual observations of ice rink conditions (RinkWatch) and frog or toad sightings (FrogWatch). While these initiatives have distinct goals, both are used to gauge local variations in environmental variables, such as temperature and habitat quality, respectively.
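To make the idea of scoring aggregation units concrete, the following is a minimal sketch and not the S-COM formulation itself, which this excerpt does not give. It assumes point observations of the form `(x, y, user_id)` binned into a square grid, and it uses simple stand-ins for the three components named in the abstract: coverage (share of occupied sub-cells), user-heterogeneity (distinct contributors per observation), and density (count relative to the busiest cell). The function names, the equal-weight average, and the `sub_div` and `threshold` parameters are all illustrative assumptions.

```python
from collections import defaultdict
from math import floor


def cell_scores(points, cell_size, sub_div=4):
    """Score each occupied grid cell on three illustrative components.

    points: iterable of (x, y, user_id) tuples in planar coordinates.
    All component definitions here are assumptions, not the published S-COM.
    """
    obs = defaultdict(list)
    for x, y, user in points:
        cell = (floor(x / cell_size), floor(y / cell_size))
        obs[cell].append((x, y, user))
    if not obs:
        return {}

    max_count = max(len(items) for items in obs.values())
    sub_size = cell_size / sub_div
    scores = {}
    for cell, items in obs.items():
        # Coverage: share of sub-cells inside this cell holding >= 1 observation.
        occupied = {(floor(x / sub_size), floor(y / sub_size)) for x, y, _ in items}
        coverage = len(occupied) / (sub_div * sub_div)
        # User-heterogeneity: distinct contributors per observation.
        heterogeneity = len({user for _, _, user in items}) / len(items)
        # Density: observation count relative to the busiest cell.
        density = len(items) / max_count
        scores[cell] = {
            "coverage": coverage,
            "heterogeneity": heterogeneity,
            "density": density,
            # Equal-weight mean, chosen only for illustration.
            "composite": (coverage + heterogeneity + density) / 3,
        }
    return scores


def feasible_cells(scores, threshold):
    """Return the cells whose composite score meets a chosen threshold."""
    return sorted(c for c, s in scores.items() if s["composite"] >= threshold)
```

Running `cell_scores` at several values of `cell_size` and comparing which cells pass `feasible_cells` gives a rough sense of how the set of usable sub-study areas changes with grain. An appropriate threshold would depend on the analytical needs of the study.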

