Abstract

In this paper, krill herd-based fractional calculus (KH-FC) using MapReduce framework is proposed for effective text document clustering. Here, the stop word removal and stemming model is applied in the pre-processing step, helps to remove redundant information and hence the size of the information is reduced, which further enhances the clustering accuracy. Furthermore, term frequency (TF) and inverse document frequency (IDF) are employed for extracting significant features. Finally, the developed KH-FC model is utilised for clustering the text documents. The developed KH-FC algorithm is developed by combining the FC concept into the KH technique. In this method, pre-processing and feature extraction is performed in the mapper phase, whereas the clustering process is executed in the reducer phase. The performance of the developed approach is evaluated based on performance metrics, like accuracy, Jaccard coefficient, and F-measure. The developed KH-FC approach obtained better performance in terms of accuracy, Jaccard coefficient, and F-measure is 0.983, 0.936 and 0.967, respectively.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.