Abstract

Natural Language Processing (NLP) is the sub-field of Artificial Intelligence that represents and analyses human language automatically. NLP has been employed in many applications, such as information retrieval, information processing and automated answer ranking. Semantic analysis focuses on understanding the meaning of text. Among other proposed approaches, Latent Semantic Analysis (LSA) is a widely used corpus-based approach that evaluates the similarity of texts based on the semantic relations among words. LSA has been applied successfully in diverse language systems for calculating the semantic similarity of texts. However, LSA ignores the structure of sentences, i.e., it suffers from a syntactic blindness problem. LSA fails to distinguish between sentences that contain semantically similar words but have opposite meanings. Disregarding sentence structure, LSA cannot differentiate between a sentence and a list of keywords: if the list and the sentence contain similar words, comparing them using LSA leads to a high similarity score. In this paper, we propose xLSA, an extension of LSA that exploits the syntactic structure of sentences to overcome the syntactic blindness problem of the original LSA approach. xLSA was tested on sentence pairs that contain similar words but have significantly different meanings. Our results showed that xLSA alleviates the syntactic blindness problem, providing more realistic semantic similarity scores.

Highlights

  • Natural Language Processing (NLP) is the sub-field of Artificial Intelligence that focuses on understanding and generating natural language by machines (Khurana et al., 2017)

  • Simple Latent Semantic Analysis (LSA) gives a semantic similarity score of 100% to all sentence pairs that contain similar words, irrespective of the effect those words have on the meaning of each sentence. xLSA has been designed to calculate semantic similarity based not only on similar words, but also on the syntactic structure of the sentences and the positioning of words within them

  • This allows xLSA to distinguish between sentences that are semantically related on the surface level, i.e., based on the words that they contain, but convey completely different meanings
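The syntactic blindness described above can be illustrated with a minimal sketch (an assumption-based toy example, not the paper's code): a purely bag-of-words cosine similarity scores two sentences with the same words but opposite meanings as identical.

```python
# Toy illustration of syntactic blindness: cosine similarity over raw word
# counts ignores word order entirely, so a sentence and its reordering
# (with the opposite meaning) receive a perfect similarity score.
from collections import Counter
from math import sqrt

def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words counts (word order is ignored)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    vocab = set(ca) | set(cb)
    dot = sum(ca[w] * cb[w] for w in vocab)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb)

# Same word multiset, opposite meaning -> similarity 1.0 (i.e., 100%).
print(bow_cosine("the dog bit the man", "the man bit the dog"))
```

Because both sentences contain exactly the same word counts, the score is 1.0 despite the reversed roles of "dog" and "man"; this is the failure mode that xLSA's use of syntactic structure is designed to avoid.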


Introduction

Natural Language Processing (NLP) is the sub-field of Artificial Intelligence that focuses on understanding and generating natural language by machines (Khurana et al., 2017). Simple string-based metrics only apply in cases of exact word matching; they do not consider inflection, synonyms or sentence structure. Corpus-based similarity approaches address this limitation, with Hyperspace Analogue to Language (HAL), Pointwise Mutual Information – Information Retrieval (PMI-IR) and Latent Semantic Analysis (LSA) being among the most popular. LSA computes the "semantic" overlap between text snippets, based on the assumption that words with similar meanings occur in similar contexts. It has been used successfully in a diverse range of NLP applications (Landauer, 2002; Vrana et al., 2018; Wegba et al., 2018; Jirasatjanukul et al., 2019), and extensively as an approximation to human semantic knowledge and verbal intelligence in the context of Intelligent Tutoring Systems (ITS).
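The core LSA pipeline sketched above (a term-document matrix reduced by truncated singular value decomposition, with similarity measured in the latent space) can be illustrated as follows. This is a minimal sketch under stated assumptions: a tiny hypothetical corpus and k = 2 latent dimensions, not the implementation used in this paper.

```python
# Minimal LSA sketch: build a term-document count matrix, keep the top-k
# singular components, and compare documents by cosine in the latent space.
import numpy as np

corpus = [
    "human machine interface",   # toy documents (illustrative only)
    "user interface system",
    "system human system",
    "graph trees minors",
]
vocab = sorted({w for doc in corpus for w in doc.split()})

# Term-document matrix X: rows are terms, columns are documents.
X = np.array([[doc.split().count(w) for doc in corpus] for w in vocab], float)

# Truncated SVD keeps only the k strongest latent "topic" dimensions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T   # each row: one document in latent space

def cos(u, v):
    """Cosine similarity between two latent-space vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Documents 0 and 1 share latent structure (overlapping vocabulary);
# document 3 is about an unrelated topic and scores much lower.
print(cos(doc_vecs[0], doc_vecs[1]), cos(doc_vecs[0], doc_vecs[3]))
```

Note that the document vectors encode only which words co-occur, not their order; this is precisely why plain LSA cannot separate a sentence from a reordered list of its own words.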

