Abstract
Calculating article similarity lets users find related articles and documents in a collection. Identifying similar documents is valuable for text applications such as document-to-document similarity search, plagiarism checking, text mining for repetition, and text filtering. This paper proposes a new method for calculating the semantic similarity of articles. WordNet is used to find semantic associations between words. The proposed technique first compares the parts of the articles pairwise. The final score is then calculated as a weighted mean of the per-part similarities. Results are compared with human scores using Pearson’s correlation coefficient. The proposed system achieves a correlation coefficient above 87 percent, indicating that it identifies similarities in close agreement with human judgment.
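The evaluation step described above, comparing system scores with human scores, can be illustrated with a minimal sketch of Pearson's correlation coefficient. The function name and the input lists are illustrative assumptions, not the paper's code or data.

```python
import math
from typing import Sequence


def pearson_correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's r between two equally long score lists.

    xs: similarity scores produced by a system (hypothetical input).
    ys: similarity scores assigned by human raters (hypothetical input).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length >= 2")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")

    return cov / math.sqrt(var_x * var_y)
```

A value near 1 means the system ranks and spaces article pairs much as the human raters do; a reported figure of 87 percent corresponds to r = 0.87 on this scale.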
Highlights
With information technology advancing rapidly in all areas and applications, calculating the similarity of available articles and documents is a subject of considerable debate, as is designing a system that can effectively determine the semantic similarity of two documents
The dataset includes 100 articles whose semantic similarities have been scored by human raters
This paper presents a method for calculating article semantic similarities
Summary
With information technology advancing rapidly in all areas and applications, calculating the similarity of available articles and documents is a subject of considerable debate, as is designing a system that can effectively determine the semantic similarity of two documents. This matters especially for scientific articles. This paper presents a method for determining article similarity. The final results are calculated as a weighted mean of the similarities of the different parts.
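As a rough illustration of the weighted-mean step, the sketch below combines per-part similarity scores into one article-level score. The part names and weights are hypothetical placeholders; this summary does not state which parts or weights the paper uses.

```python
from typing import Dict

# Hypothetical parts and weights, chosen only for illustration.
PART_WEIGHTS: Dict[str, float] = {
    "title": 0.3,
    "abstract": 0.3,
    "keywords": 0.2,
    "body": 0.2,
}


def combine_part_similarities(
    part_scores: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Weighted mean of per-part similarity scores.

    Only parts present in both dictionaries contribute, and the result is
    normalized by the total weight of those parts.
    """
    shared = [part for part in part_scores if part in weights]
    total_weight = sum(weights[part] for part in shared)
    if total_weight == 0:
        raise ValueError("no weighted parts to combine")
    weighted_sum = sum(weights[part] * part_scores[part] for part in shared)
    return weighted_sum / total_weight
```

Normalizing by the total weight keeps the combined score in the same range as the per-part scores, even when a part is missing from one of the articles.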