High-quality geospatial data, typically administrative data, should adhere to established measurement and representation practices and be protected from malicious attacks. However, such data is often updated infrequently because its production process is prolonged compared to sources of volunteered geographic information such as OpenStreetMap. Existing approaches typically assure the quality of geospatial data by comparing it to a reference dataset of perceived higher quality, often another administrative dataset subject to a similar update cycle. In contrast, this article investigates whether changes present in volunteered geographic information such as OpenStreetMap that reflect actual changes in the real world, and therefore also need to be applied to an administrative dataset, can be identified automatically. To that end, we present QPredict, a machine learning approach that observes changes in volunteered geographic information such as OpenStreetMap to predict issues in a target (administrative) dataset. The model is trained on geospatial object characteristics, intrinsic and extrinsic quality metrics, and their respective changes over time. We evaluate the effectiveness of our approach on two datasets representing two mid-sized cities in Germany and discuss the applicability of our findings to practical use cases.