A semantic approach for text document clustering using frequent itemsets and WordNet

Harsha Patil, Ramjeevan Singh Thakur

doi:10.14419/ijet.v7i2.9.10220

Abstract

Document clustering is an unsupervised method for grouping documents into clusters on the basis of their similarity. A document is placed in a specific cluster on the basis of its membership score, which is calculated through a membership function. However, many traditional clustering algorithms are based only on the bag-of-words (BOW) model, which ignores the semantic similarity between a document and a cluster. In this research we consider the semantic association between the cluster and the text document when calculating the membership score of a document for a specific cluster. Several researchers are working on semantic aspects of document clustering to improve clustering performance. Many external knowledge bases, such as WordNet, Wikipedia, Lucene, etc., are utilized for this purpose. The proposed approach exploits WordNet to improve the cluster membership function. The experimental results show that clustering quality improves significantly with the proposed semantic framework.

Highlights

Nowadays, a search engine is a very useful and instant tool for answering any query
Many traditional clustering algorithms are based only on BOW, which ignores the semantic similarity between a document and a cluster
Text documents are normally full of abstract concepts, which are difficult to represent using traditional text mining methodology

Summary

Introduction

Nowadays, a search engine is a very useful and instant tool for answering any query. The Internet is the fastest way to learn, understand, and solve a problem or to get information from a worldwide knowledge base. Search engines use document clustering to display query results in an organized and effective manner. Many traditional clustering algorithms are based only on BOW, which ignores the semantic similarity between a document and a cluster. Lacking this, traditional document clustering algorithms cannot represent semantic associations among words, which results in lower-quality output. The use of an external knowledge base is very helpful in developing semantic approaches to document clustering. Using WordNet in clustering captures the relations between words and helps to identify the precise cluster for each document.
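The limitation described above can be illustrated with a minimal sketch. It is not the authors' method: the text here does not give the membership function, so cosine similarity is assumed, and the small `TOY_SYNSETS` table is a hypothetical stand-in for a WordNet lookup. The helper names (`bow_vector`, `concept_vector`, `membership_scores`) are likewise illustrative only. The sketch compares a BOW membership score with one computed after mapping synonyms to a shared concept.

```python
import math
from collections import Counter

# Hypothetical stand-in for WordNet: maps a word to a shared concept label.
# A real system would look these up in WordNet instead of a fixed table.
TOY_SYNSETS = {
    "car": "automobile.n.01",
    "automobile": "automobile.n.01",
    "auto": "automobile.n.01",
    "doctor": "doctor.n.01",
    "physician": "doctor.n.01",
}

def bow_vector(tokens):
    """Plain bag-of-words term counts."""
    return Counter(tokens)

def concept_vector(tokens):
    """Counts over concepts: synonyms collapse to one label."""
    return Counter(TOY_SYNSETS.get(t, t) for t in tokens)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def membership_scores(doc_tokens, clusters, vectorize):
    """Assumed membership function: cosine similarity to each cluster's terms."""
    doc_vec = vectorize(doc_tokens)
    return {name: cosine(doc_vec, vectorize(terms))
            for name, terms in clusters.items()}

clusters = {
    "vehicles": ["car", "engine", "road"],
    "medicine": ["doctor", "hospital", "patient"],
}
doc = ["automobile", "auto", "repair"]

bow = membership_scores(doc, clusters, bow_vector)
sem = membership_scores(doc, clusters, concept_vector)
print("BOW:", bow)
print("Semantic:", sem)
```

Under BOW the document shares no literal term with either cluster, so both scores are zero and it cannot be placed. After the synonym mapping, "automobile" and "auto" match the concept behind "car", and the document scores highest for the "vehicles" cluster.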

Related works

WordNet

Experimental evaluation

Conclusion and future work


Journal: International Journal of Engineering & Technology
Publication Date: Jun 1, 2018
Citations: 2
License type: cc-by
