Abstract

This paper addresses the problem of learning the concept of propagation in the theoretical formalism of pretopology, and then applies this methodology to the well-known problem of learning Lexical Taxonomies. The theory of pretopology aims, among other things, at modeling complex relations between sets of entities. Such fine-grained modeling limits scalability. However, it captures real-world relationships such as the hypernymy relation more accurately, by modeling the task of relation extraction as a propagation model under certain structuring constraints. Traditional approaches, by contrast, are limited to detecting relations between pairs of elements without considering knowledge of the expected structure.

Our proposal is to define the pseudo-closure operator (which models the concept of propagation) as a logical combination of heterogeneous neighborhoods, or sources. This allows the learning of models that exploit, for example, the knowledge acquired by both statistical and numerical approaches. We show that learning such an operator falls into the Multiple Instance (MI) framework, where the learning process is performed on bags of instances instead of individual instances. Although this framework is well suited to the task, using it to learn a pretopological space leads to an exponentially large set of bags. To overcome this problem, we propose a learning method (LPSMI) based on a low estimate of the bags covered by a concept under construction.

We first validate our method experimentally by simulating percolation processes (typically forest fires) learned with pretopological propagation models. The results show that the proposed MI approach is particularly efficient on the propagation model recognition task. We then provide a real-world contribution to the Lexical Taxonomy learning task by modeling it as a complex (semantic) propagation problem. We propose a very generic framework for training models that combine various existing methods for learning Lexical Taxonomies (statistical, pattern-based and embedding-based).
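
To make the idea of a pseudo-closure built from several neighborhoods concrete, the following is a minimal sketch, not the method of the paper. It assumes that the logical combination is written in disjunctive normal form and that an element is reached when, for every source in at least one clause, its neighborhood intersects the current set. The function names, the source names `s1` and `s2`, and the toy data are all hypothetical.

```python
from typing import Dict, Hashable, List, Set

Element = Hashable
# Assumption: a source maps each element to the set of elements it is related to.
Neighborhood = Dict[Element, Set[Element]]
# Assumption: the logical combination is a DNF, i.e. a list of clauses,
# each clause being a list of source names that must all agree.
DNF = List[List[str]]


def pseudo_closure(A: Set[Element], sources: Dict[str, Neighborhood],
                   dnf: DNF, universe: Set[Element]) -> Set[Element]:
    """One propagation step: A plus every element reached under the DNF."""
    expanded = set(A)
    for x in universe - A:
        for clause in dnf:
            # x is reached if every source of the clause links it to A.
            if all(sources[name].get(x, set()) & A for name in clause):
                expanded.add(x)
                break
    return expanded


def closure(A: Set[Element], sources: Dict[str, Neighborhood],
            dnf: DNF, universe: Set[Element]) -> Set[Element]:
    """Iterate the pseudo-closure until the set stops growing."""
    current = set(A)
    while True:
        following = pseudo_closure(current, sources, dnf, universe)
        if following == current:
            return current
        current = following


# Toy data (hypothetical): two sources over four elements.
universe = {"a", "b", "c", "d"}
sources = {
    "s1": {"b": {"a"}, "c": {"b"}, "d": {"c"}},
    "s2": {"b": {"a"}, "c": {"a"}},
}

both = closure({"a"}, sources, [["s1", "s2"]], universe)      # s1 AND s2
either = closure({"a"}, sources, [["s1"], ["s2"]], universe)  # s1 OR s2
print(sorted(both), sorted(either))
```

In this toy setting the conjunctive formula stops propagating before reaching `d`, because only `s1` links `d` to the rest, while the disjunctive formula covers the whole universe. Learning which formula best reproduces an observed propagation is the kind of problem the abstract describes, under the assumptions stated above.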

