An efficient semantic clustering of URLs for web page recommendation

Sanjeev Kumar Sharma, Ugrasen Suman

doi:10.1504/ijdats.2013.058578

Abstract

Document clustering is a text-mining process in which documents with similar content are grouped into one cluster while dissimilar documents are placed in other clusters. The number of text and hypertext documents is growing quickly with the rapid growth of the WWW, and discovering the truly relevant content for a given user or purpose has become a very challenging task because of the huge size, high dynamics and large diversity of the web. Web browsers retrieve information in the form of images, audio, video and text from web pages through URLs, and some URLs are used frequently by web users. In this paper, an efficient semantic clustering (ESC) algorithm is proposed in which URLs are clustered together to find larger clusters of the most frequent URLs. The ESC algorithm is evaluated on two large datasets for semantic clustering. The proposed approach will be useful for recommending the most appropriate and relevant URLs to web users according to their queries.
