Abstract

Cloud computing has emerged as a real-world technology delivered over the Internet. With the growth of high-dimensional big data, data storage in the cloud has gained considerable scope in recent years. Document clustering is a fundamental topic that has become an indispensable component in many areas, including cloud computing. It partitions a document collection into meaningful classes or groups so that relevant documents can be retrieved. Many researchers have used factorization methods and ontologies for document clustering based on internal and external knowledge. However, existing methods fail to provide semantic feature construction, which leads to information loss when covering all the ideas in the documents. To address these problems, this paper reviews different document clustering techniques in the cloud. In addition, a document clustering method, Entropy-based Boosting with Projection Neural Feature (EB-PNF), is presented. The proposed method involves two stages: similar-document identification based on a semantic similarity score, and feature extraction, which covers both single-label and multi-label features. The method is assessed in terms of precision, recall, and computational complexity to show that EB-PNF produces high-quality clusters comparable to state-of-the-art methods.
