Abstract
Document clustering is an essential process in data mining, where large numbers of heterogeneous documents are scattered across collections. Grouping the documents effectively allows meaningful information to be extracted from them. Several earlier studies have concentrated on clustering real-world documents. In previous work, document clustering was performed with the term weight-based hybridised harmony K-means search (TW HHKM), the coverage factor-based hybridised harmony K-means search (CF HHKM), and the concept-based, kernel and weighted feature-based clustering algorithm (CKW HHKM). In these methods, clustering is typically performed with the K-means algorithm, and the cluster centroids are optimised with the harmony search algorithm. The problem with these existing methods is poor clustering accuracy: unrelated documents are grouped together. To overcome this problem, a multi-viewpoint HHKM (MP HHKM) approach is introduced, which clusters documents more accurately. In this work, multi-viewpoint analysis is performed on the basis of a similarity measurement. Experiments were conducted on the newsgroup and TREC datasets, and the results show that the proposed MP HHKM technique outperforms the existing techniques in accuracy.
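The abstract does not state the similarity formula, so the following is only a minimal sketch of one common multi-viewpoint formulation, not the paper's method. It assumes documents are already represented as numeric term vectors, and that the similarity of two documents is the average dot product of their difference vectors taken from a set of viewpoint documents (for example, documents outside the candidate cluster). The function name `multi_viewpoint_similarity` and its parameters are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multi_viewpoint_similarity(
    d_i: Vector, d_j: Vector, viewpoints: Sequence[Vector]
) -> float:
    """Average of (d_i - d_h) . (d_j - d_h) over all viewpoints d_h.

    Assumption: this averaged form is one common multi-viewpoint
    measure; the abstract itself gives no formula.
    """
    if not viewpoints:
        raise ValueError("at least one viewpoint is required")
    total = 0.0
    for d_h in viewpoints:
        # Look at both documents from the position of the viewpoint d_h.
        diff_i = [x - h for x, h in zip(d_i, d_h)]
        diff_j = [x - h for x, h in zip(d_j, d_h)]
        total += dot(diff_i, diff_j)
    return total / len(viewpoints)
```

With the origin as the only viewpoint, the measure reduces to the ordinary dot product (cosine similarity for unit-length vectors), so the single-viewpoint case is a special case of this sketch. Using many viewpoints instead averages the judgement over several reference documents.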