Harnessing spatio‐temporal patterns in data for nominal attribute imputation

Rajesh Chittor Sundaram, Elham Naghizade, Renata Borovica‐Gajic, Martin Tomko

doi:10.1111/tgis.12617

Open Access

https://doi.org/10.1111/tgis.12617

Journal: Transactions in GIS : TG
Publication Date: Mar 31, 2020
Citations: 4
License type: CC BY 4.0

Affiliation: University of Melbourne

Abstract

Missing data in Volunteered Geographic Information (VGI) are an unavoidable consequence of data collection by non‐experts, guided by only vague and informal mapping guidelines. While various Missing Value Imputation (MVI) techniques have been proposed as data cleansing strategies, they have primarily targeted numerical data attributes in non‐spatial databases. There remains a significant gap in methods for imputing nominal attribute values (e.g., Street Name) in map databases. Here, we present an imputation algorithm called the Membership Imputation Algorithm (MIA), targeting spatial databases and enabling imputation of nominal values in spatially referenced records. By targeting membership classes of spatial objects, MIA harnesses spatio‐temporal characteristics of data and proposes efficient heuristics to impute the class name (i.e., a membership). Experimental results show that the proposed algorithm is able to impute the membership with high levels of accuracy (over 94%) when assigning Street Name(s), across highly diverse regional contexts. MIA is effective in challenging spatial contexts such as street intersections. Our research serves as a first step in highlighting the effectiveness of spatio‐temporal measures as a key driver for nominal imputation techniques.

Highlights

Many real-world data sets are dirty (Prasad et al., 2011).
We propose the Membership Imputation Algorithm (MIA), which imputes the nominal attributes of an OSM relation for any map feature by evaluating the spatial and temporal proximity of the neighboring map features that already belong to an existing relation.
Entities at intersections are analyzed separately because their neighborhoods present a unique challenge: the neighbors are distributed across multiple Associated Street Relation (ASR) membership classes, unlike the neighborhood around a single street, which is the pattern dominating the overall data set.
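
The idea of imputing a membership from spatially and temporally close neighbors can be illustrated with a small sketch. This is not the authors' MIA implementation, whose heuristics are not given on this page; it is a generic proximity-weighted vote written under assumptions. The `Feature` class, the `impute_street` function, the `radius` and `time_scale` parameters, and the specific weighting formulas are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Feature:
    """Hypothetical map feature; field names and units are assumptions."""
    x: float                      # projected easting, in metres
    y: float                      # projected northing, in metres
    edited_at: float              # edit time, in days since some reference
    street: Optional[str] = None  # membership class; None when missing


def impute_street(target: Feature,
                  neighbors: Iterable[Feature],
                  radius: float = 100.0,
                  time_scale: float = 30.0) -> Optional[str]:
    """Return the street name with the highest proximity-weighted score.

    Each labelled neighbor within `radius` votes for its own street.
    The vote is weighted by spatial closeness and by how close in time
    the neighbor was edited relative to the target (illustrative
    weights, not taken from the paper).
    """
    scores = defaultdict(float)
    for n in neighbors:
        if n.street is None:
            continue  # unlabelled neighbors cannot vote
        dist = math.hypot(n.x - target.x, n.y - target.y)
        if dist > radius:
            continue  # outside the spatial neighborhood
        spatial = 1.0 / (1.0 + dist)
        temporal = math.exp(-abs(n.edited_at - target.edited_at) / time_scale)
        scores[n.street] += spatial * temporal
    if not scores:
        return None  # no labelled neighbor in range; leave the value missing
    return max(scores, key=scores.get)
```

At an intersection, labelled neighbors from several streets fall inside the same radius, so a purely spatial vote can be close to a tie; in this sketch the temporal weight acts as a tie-breaker, on the assumption that features edited around the same time are more likely to belong to the same street.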

Summary

Introduction

Many real-world data sets are dirty (Prasad et al., 2011). The term dirty data refers to data sets with issues such as missing or incorrect records or values (Simoudis, Livezey, & Kerber, 1995), non-standard representations (Williams, 1997), outliers (Hawkins, He, Williams, & Baxter, 2002), and duplicate values (Hernández & Stolfo, 1998). OpenStreetMap (OSM) (https://www.openstreetmap.org), the most prominent VGI data source, is heavily impacted by map features with incomplete attribute data (Davidovic, Mooney, Stoimenov, & Minghini, 2016). This issue is common in databases without a strict schema or data definition rules. In OSM, the free tagging system allows contributors to use an unlimited number of attributes to describe a map feature. This free-form nature of tagging, coupled with a lack of adherence to community guidelines (https://wiki.openstreetmap.org/wiki/Tagging), results in considerable missing data for features.
