A Hybrid Approach for Measuring Semantic Similarity between Documents and its Application in Mining the Knowledge Repositories

Ms K, Dr Chidambaram

doi:10.14569/ijacsa.2016.070831

Abstract

This paper explains similarity measures and the relationship between knowledge repositories. It also describes the significance of document similarity measures, the algorithms involved, and the types of text to which they can be applied. Document similarity measures include full-text similarity, paragraph similarity, sentence similarity, semantic similarity, structural similarity, and statistical measures. Two frameworks are proposed in this paper: one for measuring document-to-document similarity, and another for measuring similarity between one document and multiple documents. Either model can be implemented with any one of the similarity measures, an aspect that is put forth for further research.

Highlights

Information on the web is increasing rapidly day by day
To address the resulting difficulty of finding relevant documents, this paper proposes semantic similarity based document retrieval
Several natural language applications, such as information retrieval, information recommendation, and machine translation, require measuring the similarity between sentences or documents

Summary

INTRODUCTION

Objectives: Information on the web is increasing rapidly day by day. With the growth of web-based information and of the number of internet users, it is difficult for users to find the documents relevant to their particular needs. Several natural language applications, such as information retrieval, information recommendation, and machine translation, require measuring the similarity between sentences or documents. Several recent applications of natural language processing demand an effective approach to calculating the similarity between sentences, as in [1]. Measures of similarity and relatedness can be extended to many types of entities, such as words, sentences, texts, concepts, or ontologies, depending on the requirement. Tasks such as document classification and clustering, information retrieval, and synonym extraction require precise measurement of semantic similarity between words. Because several applications and domains require semantic similarity, the measurement of sentence and document similarity has great significance. Calculating semantic similarity among entities has applications in areas such as recommendation systems, e-commerce, search engines, and biomedical informatics, and in natural language processing tasks such as word sense disambiguation. Short text similarity is important in applications such as text summarization, as in [6], text categorization, as in [7], and machine translation, as in [8].

RELATED WORK

Hybrid Approaches

PROPOSED WORK

Jaccard Similarity Coefficient
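
For orientation, the Jaccard coefficient of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below is a minimal illustration in Python and is not the paper's implementation: it assumes documents are compared as sets of lowercased word tokens, and the function names (tokenize, jaccard, rank_against) are hypothetical. It covers both settings named in the abstract, one document against another and one document against several, but it is purely lexical and does not include the ontology or corpus components of the hybrid approach.

```python
import re


def tokenize(text):
    """Lowercase a document and return its set of word tokens (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(doc_a, doc_b):
    """Jaccard coefficient |A & B| / |A | B| over the two token sets."""
    a, b = tokenize(doc_a), tokenize(doc_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty documents
    return len(a & b) / len(a | b)


def rank_against(query_doc, corpus):
    """Score one document against many, sorted by descending similarity."""
    scores = [(jaccard(query_doc, doc), doc) for doc in corpus]
    return sorted(scores, key=lambda pair: pair[0], reverse=True)
```

For example, jaccard("the cat sat", "the cat ran") shares two of four distinct tokens and so evaluates to 0.5. Any other measure listed in the abstract could be substituted for jaccard inside rank_against without changing the one-to-many structure.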

Hybrid Approach For Measuring Document Similarity Using Ontology And Corpus

POS Tagger Using Hidden Markov Model

Sweto Ontology

RESULT

CONCLUSION


Journal: International Journal of Advanced Computer Science and Applications	Publication Date: Jan 1, 2016
Citations: 3	License type: cc-by


