Abstract

The main goal of this paper is to extract the semantic relations underpinning the concepts of English prepositional of-constructions derived from poetic and non-poetic datasets, using Princeton WordNet. The problem is addressed by two different algorithms, which are evaluated for their ability to model the different types of resources from which the relations are derived, and for their ability to predict unseen relations. The first algorithm introduces the concept of a subsumption hierarchy between relations in order to derive the most general relations associated with each type of data source and to identify a set of relations specific to each dataset. The second algorithm investigates the use of a weighting scheme to establish the importance of each extracted association. Of particular importance are the notions of subsumption hierarchies between relations (expressed as synset pairs) and the Inverse Relation Frequency (IRF) measure, which is inspired by the Inverse Document Frequency measure used in Information Retrieval. The prospects of using Princeton WordNet and the above algorithms for the creation of ontologies are also briefly discussed. Although the main interest of the proposed methods lies in the identification of conceptual relations particular to poetic resources, the methods can be applied to other domains and are evaluated on them as well.
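The abstract does not give the formula for the IRF measure, only that it is inspired by Inverse Document Frequency. As a minimal sketch under that assumption, the snippet below applies the textbook IDF form, log(N / n_r), where N is the number of datasets and n_r is the number of datasets in which relation r occurs. The function name `inverse_relation_frequency`, the dataset labels, and the synset pairs are illustrative choices, not taken from the paper, and the paper's actual weighting may differ.

```python
import math
from collections import Counter


def inverse_relation_frequency(datasets):
    """Weight each relation by log(N / n_r), by analogy with IDF.

    `datasets` maps a dataset name to a set of relations, where each
    relation is a (synset, synset) pair. This is an assumed formula,
    not necessarily the one defined in the paper.
    """
    n_datasets = len(datasets)
    # Count in how many datasets each relation appears.
    dataset_frequency = Counter()
    for relations in datasets.values():
        dataset_frequency.update(set(relations))
    return {
        relation: math.log(n_datasets / count)
        for relation, count in dataset_frequency.items()
    }


# Toy data: hand-picked pairs for illustration only.
toy_datasets = {
    "poetic": {("sea.n.01", "sorrow.n.01"), ("cup.n.01", "tea.n.01")},
    "non_poetic": {("cup.n.01", "tea.n.01"), ("director.n.01", "company.n.01")},
}

irf = inverse_relation_frequency(toy_datasets)
for relation, weight in sorted(irf.items()):
    print(relation, round(weight, 3))
```

Under this form, a relation that occurs in every dataset receives weight 0, while a relation found in only one dataset receives the highest weight, which matches the stated aim of highlighting associations specific to one type of resource.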

Highlights

  • Several classification algorithms for natural language resources are cited in the literature; the most promising in recent years use deep learning methods for the automatic classification of phrases and texts [1]–[3]

  • What is missing from these approaches is the extraction and representation of the semantic relations between concepts that play a prominent role in the representation of domains, and in the prediction of resources from which phrases are derived by reference to the semantic relations specific to each resource

  • The primary goal of this paper is twofold: (i) to study the use of Princeton WordNet (PWN) [25] in the extraction of semantic relationships underpinning poetic versus non-poetic datasets or domains of of-prepositions, and (ii) to use the relations extracted from a subset of each type of resource to predict the resource type of relations not included in the training set


Summary

Introduction

Several classification algorithms for natural language resources are cited in the literature; the most promising in recent years use deep learning methods for the automatic classification of phrases and texts [1]–[3]. The primary goal of this paper is twofold: (i) to study the use of Princeton WordNet (PWN) [25] in the extraction of semantic relationships underpinning poetic versus non-poetic datasets or domains of of-prepositions, and (ii) to use the relations extracted from a (training) subset of each type of resource to predict the resource type of relations not included in the training set. Although the experiments were conducted primarily on poetic and non-poetic resources, the proposed methods can be used to model any type of resource. For this reason, they have been tested on other domains too. The sense director.n.01 occurs more frequently than the sense director.n.02.
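The idea of a subsumption hierarchy between relations expressed as synset pairs can be illustrated with a small sketch. It assumes that a pair (A, B) subsumes a pair (C, D) when A is C or one of its hypernyms and B is D or one of its hypernyms; the paper may define subsumption differently. The hypernym map below is a hand-made toy, not verified against PWN, and the names `HYPERNYM`, `subsumes`, and `most_general` are hypothetical.

```python
# Toy hypernym map (child -> parent). Hand-made for illustration;
# it is NOT extracted from Princeton WordNet.
HYPERNYM = {
    "rose.n.01": "flower.n.01",
    "flower.n.01": "plant.n.01",
    "love.n.01": "emotion.n.01",
    "emotion.n.01": "feeling.n.01",
}


def ancestors(synset):
    """Return the synset itself followed by its hypernym chain in the toy map."""
    chain = [synset]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def subsumes(general, specific):
    """True if each side of `general` equals or is a hypernym of the
    corresponding side of `specific` (assumed definition)."""
    g_left, g_right = general
    s_left, s_right = specific
    return g_left in ancestors(s_left) and g_right in ancestors(s_right)


def most_general(relations):
    """Keep only the relations that no other relation in the set subsumes."""
    return {
        r for r in relations
        if not any(o != r and subsumes(o, r) for o in relations)
    }


relations = {
    ("rose.n.01", "love.n.01"),
    ("flower.n.01", "emotion.n.01"),
    ("plant.n.01", "love.n.01"),
}
general_relations = most_general(relations)
print(sorted(general_relations))
```

In this toy set, the pair (rose.n.01, love.n.01) is dropped because two other pairs subsume it, while the remaining two pairs are kept because neither subsumes the other. With real data, the hypernym chain would come from a WordNet interface instead of a fixed dictionary, and a synset with several hypernyms would require a graph traversal, not a single chain.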
