Abstract
Background: Clustering is an important data mining technique for grouping related data. It can be applied to numerical data as well as to web objects such as URLs, websites, documents, and keywords, and it is a building block for many recommender systems and prediction models. Objective: The objective of this article is to develop an optimal clustering approach that considers the semantics of web objects when grouping them. More importantly, the proposed work aims to reduce the computation time of the clustering process. Methods: To achieve these objectives, two contributions are proposed: 1) a semantic similarity measure based on Wu-Palmer semantic similarity, and 2) a two-level density-based clustering technique that reduces the computational complexity of the density-based clustering approach. Results: The efficacy of the proposed method was analyzed on AOL search logs containing 20 million web queries. The results showed that the approach increases the F-measure and decreases the entropy. It also reduces the computational complexity and provides a competitive alternative strategy for semantic clustering when conventional methods do not provide helpful suggestions. Conclusion: A clustering model has been proposed that is composed of two components, namely a similarity measure and a two-level density-based clustering technique. The proposed model reduced the time cost of the density-based clustering approach without affecting performance.
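For readers unfamiliar with the Wu-Palmer measure named in the Methods, the following is a minimal sketch of the standard formula, sim(a, b) = 2 * depth(lcs(a, b)) / (depth(a) + depth(b)), where lcs is the deepest common ancestor of the two concepts in a taxonomy. The toy taxonomy and the function names below are assumptions made for illustration only; the abstract does not specify the taxonomy used or how the proposed measure extends Wu-Palmer similarity.

```python
# Hypothetical toy taxonomy: each concept maps to its parent; the root maps to None.
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "animal": "entity",
    "car": "vehicle",
    "truck": "vehicle",
    "dog": "animal",
}


def path_to_root(node):
    """Return the list of concepts from `node` up to and including the root."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest concept that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward from b, so the first hit is the deepest
        if node in ancestors_of_a:
            return node
    raise ValueError("concepts do not share a root")


def wu_palmer(a, b):
    """Standard Wu-Palmer similarity in (0, 1]; identical concepts score 1."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    for pair in [("car", "truck"), ("car", "dog"), ("car", "car")]:
        print(pair, round(wu_palmer(*pair), 3))
```

In this toy taxonomy, "car" and "truck" share the deeper ancestor "vehicle" and therefore score higher than "car" and "dog", whose only common ancestor is the root. A density-based clusterer can consume such scores by converting them to distances, for example 1 - sim.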
More From: Recent Advances in Computer Science and Communications