Abstract

Document clustering is a text-mining process in which documents with similar content are grouped into one cluster, while dissimilar documents are placed in other clusters. The number of text and hypertext documents is growing quickly with the rapid expansion of the WWW, and the huge size, high dynamics, and large diversity of the web make it very challenging to discover the content that is truly relevant to a given user or purpose. Web browsers retrieve information in the form of images, audio, video, and text from web pages through URLs, and some URLs are used frequently by web users. In this paper, an efficient semantic clustering (ESC) algorithm is proposed in which a number of URLs are clustered together to find larger clusters of the most frequent URLs. The ESC algorithm is evaluated on two large datasets for semantic clustering. The proposed approach will be useful for recommending the most appropriate and relevant URLs to web users according to their queries.
