Abstract

Calculating article similarity enables users to find similar articles and documents in a collection. Identifying similar documents is useful for text applications such as document-to-document similarity search, plagiarism checking, text mining for repetition, and text filtering. This paper proposes a new method for calculating the semantic similarity of articles. WordNet is used to find semantic associations between words. The proposed technique first compares the corresponding parts of two articles pairwise. The final result is then calculated as a weighted mean of the similarities of the different parts. The results are compared with human-assigned scores using Pearson’s correlation coefficient. The proposed system achieves a correlation coefficient above 87 percent, which indicates that it identifies similarities in close agreement with human judgment.
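The evaluation described above, comparing system scores with human scores through Pearson's correlation coefficient, can be illustrated with a minimal sketch. The function name `pearson` and the sample scores are hypothetical and are not taken from the paper's dataset; the paper may compute the coefficient with a different tool.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant sequences")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five article pairs (illustration only).
human_scores = [0.10, 0.40, 0.55, 0.70, 0.95]
system_scores = [0.15, 0.35, 0.60, 0.65, 0.90]
print(pearson(human_scores, system_scores))
```

A value close to 1 means the system ranks and spaces the article pairs much as the human raters do.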

Highlights

  • With the rapid advancement of information technology across all areas and applications, calculating the similarity of available articles and documents is a subject of considerable interest, as is the design of a system that can effectively determine the semantic similarity of two documents

  • The dataset includes 100 articles whose semantic similarities have been scored by several individuals

  • This paper presents a method for calculating the semantic similarity of articles

Summary

Introduction

With the rapid advancement of information technology across all areas and applications, calculating the similarity of available articles and documents is a subject of considerable interest, as is the design of a system that can effectively determine the semantic similarity of two documents. This is especially important for scientific articles. This paper presents a method for determining article similarity. The final result is calculated as a weighted mean of the similarities of the different parts of the articles.
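The weighted-mean step can be sketched as follows. This is only an illustration under stated assumptions: the part names (`title`, `abstract`, `body`), the per-part scores, and the weights are hypothetical, since this summary does not state which parts or weights the paper uses.

```python
from typing import Dict


def weighted_article_similarity(
    part_scores: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Weighted mean of per-part similarity scores.

    part_scores maps a part name to the similarity of that part between
    two articles; weights maps the same part names to their importance.
    """
    total_weight = sum(weights[part] for part in part_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(part_scores[part] * weights[part] for part in part_scores)
    return weighted_sum / total_weight


# Hypothetical parts, scores, and weights (not from the paper).
part_scores = {"title": 0.80, "abstract": 0.60, "body": 0.50}
weights = {"title": 1.0, "abstract": 2.0, "body": 1.0}

score = weighted_article_similarity(part_scores, weights)
print(round(score, 3))  # (0.80*1 + 0.60*2 + 0.50*1) / 4 = 0.625
```

Dividing by the sum of the weights keeps the final score in the same range as the per-part scores, so the weights only need to express relative importance.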

Literature review
The proposed method
Text preprocessing tool
WordNet
Calculating sentence semantic similarities between two texts
Calculating word semantic similarities between two sentences
Calculating semantic similarities between articles
Evaluation and results
Conclusion
