NGram Approach for Semantic Similarity on Arabic Short Text

Rana Husni Al-Mahmoud, Ahmad Sharieh

doi:10.14569/ijacsa.2022.0131199

Open Access


Abstract

Measuring the semantic similarity between words requires a method that can simulate human thought. The use of computers to quantify and compare semantic similarities has become an important research area in various fields, including artificial intelligence, knowledge management, information retrieval, and natural language processing. Computational semantics requires efficient measures for computing concept similarity, and such measures still need to be developed. Several computational measures quantify semantic similarity based on knowledge resources such as the WordNet taxonomy. Several measures based on taxonomical parameters have been applied to optimize the expression for content semantics. This paper presents a new similarity measure for quantifying the semantic similarity between concepts, words, sentences, short texts, and long texts based on NGram features and synonyms of NGrams related to the same domain. The proposed algorithm was tested on 700 tweets, and the semantic similarity values were compared with cosine similarity on the same dataset. The results were analyzed manually by a domain expert, who concluded that, within the selected domain, the values provided by the proposed algorithm reflected the semantic similarity between the dataset's short texts better than the cosine similarity values did.
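
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes word-level n-grams, a hypothetical domain lexicon (`SYNONYMS`) that maps an n-gram to a canonical concept label, and a Jaccard overlap as the score; all function names are illustrative. English tokens stand in for Arabic ones, and a real pipeline would also need Arabic tokenization and normalization. A plain bag-of-words cosine similarity is included as the baseline.

```python
from collections import Counter
from math import sqrt


def word_ngrams(tokens, n):
    """Return the word n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def canonicalize(gram, synonyms):
    """Map an n-gram to a canonical form using the (hypothetical) lexicon.

    If the whole n-gram has an entry, it collapses to that concept label.
    Otherwise each word is replaced by its own label when one exists.
    """
    if gram in synonyms:
        return (synonyms[gram],)
    return tuple(synonyms.get((word,), word) for word in gram)


def canonical_ngrams(text, synonyms, max_n=2):
    """Collect the canonicalized 1..max_n word n-grams of a text as a set."""
    tokens = text.split()
    grams = set()
    for n in range(1, max_n + 1):
        for gram in word_ngrams(tokens, n):
            grams.add(canonicalize(gram, synonyms))
    return grams


def ngram_synonym_similarity(a, b, synonyms, max_n=2):
    """Jaccard overlap of canonicalized n-gram sets (an assumed scoring rule)."""
    grams_a = canonical_ngrams(a, synonyms, max_n)
    grams_b = canonical_ngrams(b, synonyms, max_n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def cosine_similarity(a, b):
    """Bag-of-words cosine similarity, used here as the baseline."""
    counts_a, counts_b = Counter(a.split()), Counter(b.split())
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a)
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy domain lexicon, invented for illustration only.
SYNONYMS = {("car",): "vehicle"}

if __name__ == "__main__":
    text_a = "bought new car"
    text_b = "bought new vehicle"
    print(ngram_synonym_similarity(text_a, text_b, SYNONYMS))
    print(cosine_similarity(text_a, text_b))
```

In this toy pair the two texts differ only by a synonym, so the synonym-aware n-gram score treats them as identical, while the cosine baseline penalizes the differing surface word. This is one way a domain lexicon could account for the expert's preference reported above, under the stated assumptions.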
