Selecting a text similarity measure for a content-based recommender system

Manjula Wijewickrema, Vivien Petras, Naomal Dias

doi:10.1108/el-08-2018-0165

Abstract

Purpose: The purpose of this paper is to develop a journal recommender system that compares the content similarity between a manuscript and existing journal articles in two subject corpora (covering the social sciences and medicine). The study examines the appropriateness of three text similarity measures and the impact of numerous aspects of corpus documents on system performance.

Design/methodology/approach: Three similarity measures were implemented one at a time on a journal recommender system with two separate journal corpora. Two distinct samples of test abstracts were classified and evaluated based on the normalized discounted cumulative gain.

Findings: The BM25 similarity measure outperforms both the cosine and unigram language similarity measures overall. The unigram language measure shows the lowest performance. The performance results are significantly different between each pair of similarity measures, while the BM25 and cosine similarity measures are moderately correlated. The cosine similarity achieves better performance for subjects with a higher density of technical vocabulary and shorter corpus documents. Moreover, increasing the number of corpus journals in the domain of social sciences improved the performance of cosine similarity and BM25.

Originality/value: This is the first work to compare the suitability of a number of string-based similarity measures with distinct corpora for journal recommender systems.
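The evaluation metric named in the abstract, normalized discounted cumulative gain (nDCG), can be illustrated with a minimal sketch. The abstract does not say which nDCG variant the authors used, so the linear gain, the log2 rank discount, and the relevance grades below are assumptions chosen for illustration only.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of graded relevance scores in ranked order.

    Assumes linear gain and a log2(rank + 1) discount (one common variant).
    """
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades for five recommended journals, in ranked order.
ranked_grades = [3, 2, 3, 0, 1]
score = ndcg(ranked_grades)
print(f"nDCG = {score:.4f}")
```

A ranking that already lists journals in descending order of relevance scores 1.0, and misplaced relevant journals lower the score in proportion to how far down the list they appear.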

Journal: The Electronic Library
Publication Date: Jul 11, 2019
Citations: 6

Similar Papers

Cosine similarity and distance measures for [formula omitted] quasirung orthopair fuzzy sets: Applications in investment decision-making
Muhammad Rahim ... Thabet Abdeljawad
Heliyon | VOL. 10
31 May 2024

Interval-Valued Intuitionistic Fuzzy Ordered Weighted Cosine Similarity Measure and Its Application in Investment Decision-Making
Donghai Liu ... Dan Peng
Complexity | VOL. 2017
01 Jan 2017

Some Similarity and Distance Measures between Complex Interval-Valued q-Rung Orthopair Fuzzy Sets Based on Cosine Function and their Applications
Harish Garg ... Tahir Mahmood
Mathematical Problems in Engineering | VOL. 2021
27 Apr 2021

Asymmetrically Weighted Cosine Similarity Measure for Recommendation Systems
Sahil Mishra ... Sanjaya Kumar Panda
01 Jan 2021
