Abstract

An ontology defines a set of representational primitives which model a domain of knowledge or discourse. With the rise of fields such as information extraction and knowledge management, ontologies have become a driving factor in many modern systems. Ontology population, however, is an inherently problematic process, as it requires manual intervention to prevent conceptual drift. Semantics-sensitive word embeddings have become a popular topic in natural language processing owing to their capability to cope with semantic challenges. Incorporating domain-specific semantic similarity into word embeddings could potentially improve performance on semantic-similarity tasks in specific domains. Thus, in this study, we propose a novel approach to semi-supervised ontology population that uses word embeddings and domain-specific semantic similarity as its basis. We built several models, including traditional benchmark models and new models based on word embeddings, and finally ensembled them into a synergistic model that outperformed the best-performing candidate model by 33%.

Highlights

  • The use of ontologies is becoming increasingly prevalent in computational tasks across many different fields. Many research areas, such as knowledge engineering and representation, information retrieval and extraction, knowledge management, and agent systems [1], have incorporated ontologies to a great extent. As defined by Thomas R. Gruber, an ontology defines a set of representational primitives which model a domain of knowledge or discourse

  • Word embeddings can be identified as a collective name for a set of language-modelling and feature-learning techniques in natural language processing

  • Semantic similarity measurements based on linguistic features are a fundamental component of almost all natural language processing (NLP) tasks, such as information retrieval, information extraction, and natural language understanding [15]


Summary

INTRODUCTION

The use of ontologies is becoming increasingly prevalent in computational tasks across many different fields. We propose a novel method for semi-supervised instance population of an ontology using word vector embeddings. Word embeddings can be identified as a collective name for a set of language-modelling and feature-learning techniques in natural language processing. The basic idea behind word embedding is that words or phrases from the vocabulary are mapped to vectors of real numbers. We use these vectors as the basis for instance population in an ontology. For this purpose, we built an iterative model based on a class representative vector for each ontology class [19].
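The class-representative-vector idea can be sketched as follows: average the embeddings of a class's known seed instances to form a representative vector, then assign each candidate instance to the class whose representative vector is most similar by cosine similarity. The embedding values, class names, and seed words below are purely illustrative assumptions, not data from the study.

```python
import numpy as np

# Hypothetical pre-trained word embeddings (illustrative values only).
embeddings = {
    "guitar": np.array([0.9, 0.1, 0.0]),
    "violin": np.array([0.8, 0.2, 0.1]),
    "apple":  np.array([0.1, 0.9, 0.3]),
    "pear":   np.array([0.2, 0.8, 0.2]),
}

def class_representative(seed_words):
    """Mean of the seed instances' vectors acts as the class representative."""
    return np.mean([embeddings[w] for w in seed_words], axis=0)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def assign(candidate, class_vectors):
    """Assign a candidate instance to the class with the most similar representative."""
    return max(class_vectors,
               key=lambda c: cosine(embeddings[candidate], class_vectors[c]))

# Seed each class with one known instance, then populate with candidates.
classes = {
    "Instrument": class_representative(["guitar"]),
    "Fruit": class_representative(["apple"]),
}
print(assign("violin", classes))  # → Instrument
print(assign("pear", classes))    # → Fruit
```

In an iterative, semi-supervised setting, confidently assigned candidates would be folded back into the seed set and the representative vectors recomputed, which is the loop this sketch omits.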

Ontologies
Word Vector Embeddings
Word Set Expansion
Ontology Population
Domain Specific Semantic Similarity
Semi-Supervised Ontology Population
Training Word Embeddings
Instances Corpus for Ontology Population
Domain Specific Semantic Similarity Measure
Candidate Model Building
Model Accuracy Measure
Ensemble Model
Findings
CONCLUSION AND FUTURE WORKS