Abstract

Knowledge-based or rule-based approaches are needed for quality assessment and assurance of professional and crowdsourced geographic data. However, many types of geographic knowledge are statistical in nature, which makes it difficult to derive rules from them that are meaningful for this purpose. The rules of continuity and symmetry considered in this paper can be thought of as two concrete forms of the first law of geography, and may be used to formulate quality measures at the individual level without referring to ground truth. It is not clear, however, how faithfully the rules hold in real data. Hence, the main objective is to test whether the rules are consistent with street network data from around the world. Specifically, for the rule of continuity, we identify natural streets that connect smoothly in a network and measure the spatial order of information (e.g. names, highway level, speed) along the streets. The measure is based on spatial autocorrelation indicators adapted for one dimension. For the rule of symmetry, we devise an algorithm that recognizes parallel road pairs (e.g. dual carriageways) and examine to what extent attributes in the pairs are identical. The two rules are tested against 28 cities selected from OpenStreetMap data worldwide; two professional data sets are used to provide further insights. We found that the rules are consistent with street networks from a wide range of cities with different characteristics, and also noted cases with varying degrees of agreement. As a side effect, we discuss possible limitations of the autocorrelation indicators used, where caution is needed when interpreting the results. In addition, we present techniques that perform the tests automatically, which can be applied to new data to further verify (or falsify) our findings, or extended as quality assurance tools to detect data items that do not satisfy the rules and to suggest possible corrections according to the rules.
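To make the symmetry check concrete, the following is a minimal sketch of how attribute agreement within parallel road pairs might be measured. It assumes the pairs have already been recognized by some other step, and that each road is represented as a plain dictionary of tags. The function name `symmetry_agreement` and the data layout are hypothetical and are not taken from the paper.

```python
def symmetry_agreement(pairs, keys):
    """Share of parallel-road pairs whose attribute values match, per key.

    pairs: list of (road_a, road_b) tuples, each road a dict of tags
           (hypothetical layout, assumed for illustration).
    keys:  attribute names to compare, e.g. "name" or "maxspeed".
    """
    result = {}
    for key in keys:
        # Only compare pairs where both sides carry the attribute.
        comparable = [(a, b) for a, b in pairs if key in a and key in b]
        if comparable:
            same = sum(1 for a, b in comparable if a[key] == b[key])
            result[key] = same / len(comparable)
    return result


# Two illustrative pairs with made-up tag values.
example_pairs = [
    ({"name": "A", "maxspeed": "50"}, {"name": "A", "maxspeed": "60"}),
    ({"name": "B"}, {"name": "B", "maxspeed": "50"}),
]
agreement = symmetry_agreement(example_pairs, ["name", "maxspeed"])
```

Pairs in which one side lacks the attribute are skipped rather than counted as disagreements; whether missing values should count against symmetry is a design choice that this sketch leaves open.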

Highlights

  • Crowdsourced geographic information or volunteered geographic information (VGI) [1] is an important source for gathering data/facts about our world, complementary to the traditional

  • We present the join-count statistic (JCS) formulation and derive some reduced forms that are suitable for one-dimensional cases; we assume sampling without replacement, as it is more realistic for geographic properties

  • In this paper we tested two rules that can be used to assess the quality of OSM data. They are the rules of continuity and symmetry and can be thought of as concrete forms of the first law of geography
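As an illustration of a one-dimensional join-count statistic under sampling without replacement, the sketch below handles the simplest case: a binary attribute along a single street, where consecutive segments form the only joins. In that setting the count of unlike joins has the same mean and variance as the classical Wald-Wolfowitz runs test. This is an assumed simplification for illustration, not the paper's own reduced forms, and the function name `join_count_1d` is hypothetical.

```python
from math import sqrt


def join_count_1d(values):
    """Unlike-join count and z-score for a binary sequence along one street.

    Assumes sampling without replacement: the numbers of each class are
    fixed and only their order along the street is considered random.
    """
    n = len(values)
    n1 = sum(1 for v in values if v)
    n2 = n - n1
    if n < 2 or n1 == 0 or n2 == 0:
        raise ValueError("need at least two segments and both classes present")
    # Observed number of unlike joins between consecutive segments.
    bw = sum(1 for a, b in zip(values, values[1:]) if bool(a) != bool(b))
    # Mean and variance of the unlike-join count on a chain of n segments.
    expected = 2 * n1 * n2 / n
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n * n * (n - 1))
    z = (bw - expected) / sqrt(variance) if variance > 0 else 0.0
    return bw, expected, variance, z


# A street whose attribute changes once, versus one that alternates.
clustered = join_count_1d([1, 1, 1, 1, 0, 0, 0, 0])
alternating = join_count_1d([1, 0, 1, 0, 1, 0, 1, 0])
```

A negative z-score means fewer unlike joins than expected under random ordering, which is what the continuity rule predicts; a positive score indicates that the attribute alternates more often than chance would suggest. For very short streets the normal approximation behind the z-score is rough, so the values are best read as indicative.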


Summary

Introduction

Crowdsourced geographic information or volunteered geographic information (VGI) [1] is an important source for gathering data/facts about our world, complementary to the traditional ...

... (grant 2017YFB0503500); XZ by the National Natural Science Foundation of China (grants 41671384 and 41301410); TA by the National High Technology Research and Development Program of China (grant 2015AA123901). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Hi-Target Surveying Instrument Co., Ltd., Wuhan Hi-Target Digital Cloud Technology Co., Ltd. and NavInfo Co., Ltd. provided support in the form of salaries for authors JY and ZW, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.

