Knowledge-based sentence semantic similarity: algebraical properties

Mourad Oussalah, Muhidin Mohamed

doi:10.1007/s13748-021-00248-0

Open Access

Abstract

Determining the extent to which two text snippets are semantically equivalent is a well-researched topic in natural language processing, information retrieval and text summarization. Sentence-to-sentence similarity scoring is extensively used in both generic and query-based document summarization as a significance or similarity indicator. Nevertheless, most of these applications use semantic similarity measures only as a tool, without paying attention to the inherent properties of such tools, which ultimately restrict the scope and technical soundness of the underlying applications. This paper aims to help fill this gap. It investigates three popular WordNet hierarchical semantic similarity measures, namely path-length, Wu and Palmer, and Leacock and Chodorow, in terms of both algebraical and intuitive properties, highlighting their inherent limitations and theoretical constraints. We have especially examined properties related to the range and scope of the semantic similarity score, incremental monotonicity evolution, monotonicity with respect to the hyponymy/hypernymy relationship, as well as a set of interactive properties. The extension from word semantic similarity to sentence similarity has also been investigated using a pairwise canonical extension. Properties of the underlying sentence-to-sentence similarity are examined and scrutinized. Next, to overcome inherent limitations of WordNet semantic similarity in accounting for the various part-of-speech word categories, a WordNet “All word-To-Noun conversion” that makes use of the Categorial Variation Database (CatVar) is put forward and evaluated on a publicly available dataset, with a comparison against some state-of-the-art methods. The findings demonstrate the feasibility of the proposal and open up new opportunities in information retrieval and natural language processing tasks.
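
To make the three hierarchical measures named in the abstract concrete, the following is a minimal sketch on a hypothetical toy is-a taxonomy rather than on WordNet itself. The taxonomy, the function names and the depth convention (root at depth 1, path length counted in edges) are assumptions made for illustration; the formulas follow the commonly cited definitions of the path-length, Wu and Palmer, and Leacock and Chodorow measures and may differ in detail from the conventions used in the paper.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); NOT actual WordNet data.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def ancestors(node):
    """Return the chain from `node` up to the root, starting with `node`."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1 (assumed convention)."""
    return len(ancestors(node))


def lcs(a, b):
    """Lowest common subsumer: the deepest ancestor shared by a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):  # walks upward, so the first hit is the deepest
        if node in seen:
            return node
    raise ValueError("concepts do not share a root")


def path_len(a, b):
    """Number of edges on the shortest is-a path between a and b."""
    c = lcs(a, b)
    return (depth(a) - depth(c)) + (depth(b) - depth(c))


# Maximum depth of the toy taxonomy, used by the Leacock and Chodorow measure.
MAX_DEPTH = max(depth(n) for n in PARENT)


def path_sim(a, b):
    """Path-length similarity: 1 / (1 + shortest path length)."""
    return 1.0 / (1 + path_len(a, b))


def wup_sim(a, b):
    """Wu and Palmer: 2 * depth(lcs) / (depth(a) + depth(b))."""
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))


def lch_sim(a, b):
    """Leacock and Chodorow: -log((1 + path length) / (2 * max depth))."""
    return -math.log((1 + path_len(a, b)) / (2.0 * MAX_DEPTH))


if __name__ == "__main__":
    for pair in [("dog", "dog"), ("dog", "cat"), ("dog", "car")]:
        print(pair, path_sim(*pair), wup_sim(*pair), lch_sim(*pair))
```

On this toy tree, an identical pair scores 1 under the path-length and Wu and Palmer measures but ln 6 (about 1.79) under Leacock and Chodorow, which illustrates why the range of the score is one of the properties worth examining.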

Highlights

Measures of semantic similarities have been primarily developed for quantifying the extent of resemblance between two words or two concepts using pre-existing resources that encode word-to-word or concept-to-concept relationships as in WordNet lexical database [1, 2]. Accurate comparison between text snippets for the similarity determination is a fundamental prerequisite in the areas of natural language processing, information retrieval, text summarization, document clustering, question answering, automatic essay scoring and others [3, 4].
– WN corresponds to the case where the sentence similarity is calculated using the canonical extension (26) with the Wu and Palmer word-to-word semantic similarity measure.
– WNwC corresponds to the case where the sentence similarity is calculated using the canonical extension (26) with the Wu and Palmer word-to-word semantic similarity measure after performing the “all-to-noun” conversion.
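
Equation (26) is not reproduced on this page, so the sketch below only illustrates one common form of a pairwise extension from word similarity to sentence similarity: each word is matched with its most similar word in the other sentence, the best scores are averaged per direction, and the two directions are averaged. The function names, the toy score table and this exact formula are assumptions for illustration, not the authors' definition.

```python
# Hypothetical word-to-word scores standing in for a WordNet-based measure
# such as Wu and Palmer; the value below is made up for illustration.
TOY_SCORES = {frozenset(("dog", "cat")): 0.5}


def toy_word_sim(a, b):
    """Toy symmetric word similarity: 1 for identical words, table otherwise."""
    if a == b:
        return 1.0
    return TOY_SCORES.get(frozenset((a, b)), 0.0)


def sentence_sim(s1, s2, word_sim):
    """Assumed pairwise extension: average of best-match scores, both ways.

    s1 and s2 are lists of tokens; word_sim is any word-to-word similarity
    function taking two tokens and returning a number.
    """
    if not s1 or not s2:
        return 0.0

    def directed(src, dst):
        # For each source word, keep its best match in the other sentence.
        return sum(max(word_sim(w, x) for x in dst) for w in src) / len(src)

    return 0.5 * (directed(s1, s2) + directed(s2, s1))


if __name__ == "__main__":
    print(sentence_sim(["dog", "runs"], ["cat", "runs"], toy_word_sim))
```

Under this assumed form, the sentence score is symmetric by construction and equals 1 when a sentence is compared with itself, provided the word measure returns 1 for identical words. An “all-to-noun” step would simply map each token to a noun form before calling `sentence_sim`.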

Summary

Introduction

Accurate comparison between text snippets to determine their similarity is a fundamental prerequisite in natural language processing, information retrieval, text summarization, document clustering, question answering, automatic essay scoring and other areas [3, 4]. Quantifying the similarity between candidate sentences helps promote good summary coverage and prevent redundancy in automatic text summarization [5]. In this respect, the similarity values of sentence pairs are sometimes used as part of the statistical features of the text summarization system. Query-based summarization crucially relies on similarity scores for summary extraction [7]. Other applications of text similarity include machine translation [13], text classification [3, 4, 14], databases, where similarity is used for schema matching [15], and bioinformatics [16].

Journal: Progress in Artificial Intelligence
Publication Date: Aug 21, 2021
Citations: 7
License type: open-access
