Abstract

Operating with ignorance is an important concern of geographical information science when the objective is to discover knowledge from imperfect spatial data. Data mining (driven by knowledge discovery tools) processes available (observed, known, and understood) samples of data in order to build a model (e.g., a classifier) that can handle data samples that are not yet observed, known, or understood. These tools traditionally take semantically labeled samples of the available data (known facts) as input for learning. We want to challenge the indispensability of this approach and suggest considering things the other way around. What if the task were as follows: how to build a model based on the semantics of our ignorance, i.e., by processing the shape of “voids” within the available data space? Can we improve traditional classification by also modeling the ignorance? In this paper, we provide algorithms for the discovery and visualization of ignorance zones in two-dimensional data spaces and design two ignorance-aware smart prototype selection techniques (incremental and adversarial) to improve the performance of nearest neighbor classifiers. We present experiments with artificial and real datasets to test the usefulness of ignorance semantics discovery.
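To make the notion of a “void” concrete, the following is a minimal sketch, not the paper's algorithm: it assumes a rectangular two-dimensional domain, overlays a regular grid, and scores each grid node by its distance to the nearest observed sample. The function names (`nearest_distance`, `ignorance_grid`, `deepest_void`) and the grid resolution are hypothetical choices made for illustration only.

```python
import math

def nearest_distance(point, samples):
    """Distance from `point` to its closest observed sample."""
    return min(math.dist(point, s) for s in samples)

def ignorance_grid(samples, bounds, steps=20):
    """Score every node of a regular grid over a rectangular domain.

    `bounds` is ((xmin, xmax), (ymin, ymax)). The score of a node is its
    distance to the nearest sample, so large scores mark empty regions.
    This is an illustrative assumption, not the measure used in the paper.
    """
    (xmin, xmax), (ymin, ymax) = bounds
    scored = []
    for i in range(steps + 1):
        for j in range(steps + 1):
            x = xmin + (xmax - xmin) * i / steps
            y = ymin + (ymax - ymin) * j / steps
            scored.append(((x, y), nearest_distance((x, y), samples)))
    return scored

def deepest_void(samples, bounds, steps=20):
    """Grid node that lies farthest from all samples, with its distance."""
    return max(ignorance_grid(samples, bounds, steps), key=lambda item: item[1])

# Four samples at the corners of the unit square leave a void in the middle.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
center, radius = deepest_void(samples, ((0.0, 1.0), (0.0, 1.0)), steps=10)
print(center, round(radius, 4))
```

A grid scan is chosen here only because it is easy to read; it ignores the shape of the domain boundary, which the paper treats as part of the model.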

Highlights

  • “Empty spaces – what are we living for?” (Queen, 1991)

  • Assumption; in Section 3, we suggest several different approaches to defining, capturing, and visualizing ignorance; in Section 4, we present the generic model of ignorance discovery, which takes into account the distribution of data within the domain and the shape of the domain boundary; in Section 5, we present one possible use case for ignorance discovery: two ignorance-aware algorithms for prototype selection in supervised instance-based learning, and we experimentally demonstrate the added value provided by ignorance awareness.

  • The idea of the Adversarial Prototype Selection (APS) algorithm, which we present in this paper, fits the concept of a generative adversarial network [38] well, i.e., assuming that the student acts as a generative model and the professor acts as a discriminative one.
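For readers unfamiliar with prototype selection for nearest neighbor classifiers, the sketch below shows a classic baseline, Hart-style condensing, in which misclassified training samples are added to the prototype set one at a time. It is not the paper's incremental or adversarial (APS) algorithm and contains no ignorance awareness; the function names and the toy data are hypothetical.

```python
import math

def nearest_label(point, prototypes):
    """Label of the prototype closest to `point` (the 1-NN rule)."""
    return min(prototypes, key=lambda p: math.dist(point, p[0]))[1]

def condense(training):
    """Hart-style condensing for a 1-NN classifier.

    `training` is a list of ((x, y), label) pairs. Starting from the first
    sample, every sample that the current prototypes misclassify is added
    as a new prototype; passes repeat until a full pass adds nothing.
    """
    prototypes = [training[0]]
    changed = True
    while changed:
        changed = False
        for sample in training:
            if sample in prototypes:
                continue  # already selected, nothing to learn from it
            point, label = sample
            if nearest_label(point, prototypes) != label:
                prototypes.append(sample)
                changed = True
    return prototypes

# Two well-separated toy classes (illustrative data, not from the paper).
training = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 0.8), "b"), ((0.8, 0.9), "b"),
]
prototypes = condense(training)
print(len(prototypes), "of", len(training), "samples kept")
```

On data like this, one prototype per class is enough to classify every training sample correctly, which is the kind of reduction that prototype selection aims for.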


Summary

Introduction

Fighting data and knowledge imperfection, from slight uncertainty to complete ignorance, has long been an important agenda for geographic information systems (GIS). The author argues for accepting uncertainty and ignorance as natural and deep-rooted properties of complex knowledge, which need to be studied rather than excised. Ignorance may have some properties in common with the information we already know. Yuan et al. [6] argue that geographic data have unique properties, which require special consideration. The size, shape, and boundaries of geographic objects can affect data mining and automated knowledge discovery about geographic processes, meaning that geographical objects cannot necessarily be reduced to points without information loss. We believe that ignorance in the GIS context has similar meaningful properties (size, shape, and boundaries), which present an additional opportunity for knowledge discovery rather than a threat.
