Text Semantic Annotation: A Distributed Methodology Based on Community Coherence

Christos Makris, Michael Angelos Simos, Georgios Pispirigos

doi:10.3390/a13070160

Abstract

Text annotation is the process of mapping a textual segment, within a given context, to the corresponding entity of a concept ontology. As the limitations of the bag-of-words paradigm become increasingly evident in modern applications, several information retrieval and artificial intelligence tasks are shifting to semantic representations to address the polysemy and homonymy inherent in natural language. Community detection has attracted great scientific interest, with extensive application in a broad range of fields such as digital marketing, bioinformatics, chemical engineering, neuroscience, and social sciences. In linguistics, where it aims to identify densely interconnected subgroups within semantic ontologies, community detection has proven beneficial for improving disambiguation and enhancing ontologies. In this paper, we introduce a novel distributed, supervised, knowledge-based methodology that employs community detection algorithms for text annotation with Wikipedia entities, and we establish the new concept of community coherence as a metric for local contextual coherence. Our experimental evaluation revealed that deeper inference of relatedness and local entity community coherence in the Wikipedia graph yields substantial overall improvements, chiefly by improving the accuracy of less common annotations. The proposed methodology attains robust disambiguation performance and is well suited to wider adoption.

Highlights

Word sense disambiguation is the process of identifying the sense of a textual segment of a sentence. The task deterministically yields a unique mapping to an entity, usually drawn from a concept ontology. Semantic ambiguity is a common property of natural language corpora, so superficial approaches like the classic bag-of-words paradigm often prove insufficient
Wikipedia at its current size of 19.3 million entities interconnected in a link-graph of about 1.11 billion links
In this work we proposed a novel distributed methodology exploring the value of community detection algorithms for the text annotation problem, introducing a novel metric referred to as community coherence

Summary

Introduction

Word Sense Disambiguation (WSD) is the process of identifying the sense of a textual segment of a sentence. The task deterministically yields a unique mapping to an entity, usually drawn from a concept ontology. Because semantic ambiguity is a common property of natural language corpora, superficial approaches like the classic bag-of-words paradigm often prove insufficient. Many recent approaches employ deep neural network architectures, which require a substantial reduction of the training input dimensionality because of their computational cost on large-scale datasets. WSD relies heavily on knowledge at its very core: neither humans nor machines could identify the appropriate sense of a polysemous mention within a context without some kind of knowledge. The manual creation of knowledge resources is an expensive and time-consuming effort, and it poses many challenges as new domains and concepts arise or change over time. Knowledge acquisition has therefore been outlined as a prevalent problem in the field of WSD.
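To make the task concrete, the following is a minimal toy sketch of knowledge-based disambiguation. It is not the methodology of the paper: the knowledge base (`CANDIDATES`, `COMMUNITY`), the entity names, and the `coherence` score are all hypothetical and hand-written for illustration, and the community labels are assumed to come from some earlier community detection step that is not shown. The sketch only illustrates why a bag-of-words count cannot separate the senses of a polysemous mention, while knowledge about the surrounding entities can.

```python
from collections import Counter

# Hypothetical toy knowledge base: mention -> candidate entities.
CANDIDATES = {
    "jaguar": ["Jaguar_(animal)", "Jaguar_Cars"],
    "engine": ["Engine"],
    "sedan": ["Sedan_(automobile)"],
    "rainforest": ["Rainforest"],
    "prey": ["Predation"],
}

# Hypothetical community labels, assumed to be the output of a
# community detection step run beforehand on an entity graph.
COMMUNITY = {
    "Jaguar_(animal)": "wildlife",
    "Rainforest": "wildlife",
    "Predation": "wildlife",
    "Jaguar_Cars": "automotive",
    "Engine": "automotive",
    "Sedan_(automobile)": "automotive",
}


def bag_of_words(text):
    """Count surface tokens; the sense of each token is lost."""
    return Counter(text.lower().split())


def coherence(candidate, context_entities):
    """Illustrative score: context entities sharing the candidate's community."""
    return sum(COMMUNITY[e] == COMMUNITY[candidate] for e in context_entities)


def annotate(text):
    """Map each known mention to its most coherent candidate entity."""
    tokens = [t for t in text.lower().split() if t in CANDIDATES]
    # Entities of unambiguous mentions act as the disambiguation context.
    context = [CANDIDATES[t][0] for t in tokens if len(CANDIDATES[t]) == 1]
    return {
        t: max(CANDIDATES[t], key=lambda c: coherence(c, context))
        for t in tokens
    }


car_text = "the jaguar has a powerful engine"
animal_text = "the jaguar stalks prey in the rainforest"

# Bag of words sees the same token in both sentences.
print(bag_of_words(car_text)["jaguar"], bag_of_words(animal_text)["jaguar"])
# The knowledge-based annotator resolves the two senses differently.
print(annotate(car_text)["jaguar"])
print(annotate(animal_text)["jaguar"])
```

In this toy setting the mention "jaguar" is counted identically by the bag-of-words representation in both sentences, whereas the annotator picks the candidate whose assumed community matches the unambiguous entities around it.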



Journal: Algorithms	Publication Date: Jul 1, 2020
Citations: 1	License type: CC BY 4.0
