Abstract

The HITS algorithm, developed by Jon Kleinberg, uses the link structure of the web to discover and rank pages relevant to a particular topic. However, it considers only the hyperlink structure and ignores the content of web pages entirely. It also ignores the fact that links may differ in importance. As a result, the algorithm is prone to topic drift. In this paper, we propose an improved HITS algorithm based on the theory of triadic closure and the vector space model (VSM). The method first computes the relevance between any two pages from their topic similarity and common reference degree. It then uses this relevance to construct a new adjacency matrix, from which authority and hub scores are calculated iteratively. Preliminary experiments show that the new algorithm improves query efficiency and quality and reduces topic drift.
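
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the relevance formula, so the sketch assumes that topic similarity is the cosine similarity of term-weight vectors, that common reference degree is the Jaccard overlap of the two pages' neighbour sets, and that the two are mixed linearly by a hypothetical parameter `alpha`. All function names are illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two term-weight vectors (vector space model)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def common_reference_degree(links, i, j):
    """Jaccard overlap of the neighbour sets of pages i and j.

    Assumption: this stands in for the paper's triadic-closure measure,
    which the abstract does not define.
    """
    n = len(links)
    neigh_i = {k for k in range(n) if links[i][k] or links[k][i]} - {i, j}
    neigh_j = {k for k in range(n) if links[j][k] or links[k][j]} - {i, j}
    union = neigh_i | neigh_j
    return len(neigh_i & neigh_j) / len(union) if union else 0.0


def relevance_matrix(links, term_vectors, alpha=0.5):
    """Replace each 0/1 link with a relevance weight (assumed linear mix)."""
    n = len(links)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if links[i][j]:
                sim = cosine_similarity(term_vectors[i], term_vectors[j])
                common = common_reference_degree(links, i, j)
                weights[i][j] = alpha * sim + (1 - alpha) * common
    return weights


def normalize(scores):
    """Scale a score vector to unit Euclidean length (no-op if all zero)."""
    norm = math.sqrt(sum(s * s for s in scores))
    return [s / norm for s in scores] if norm > 0 else scores


def weighted_hits(weights, iterations=50):
    """Standard HITS update rules applied to a weighted adjacency matrix."""
    n = len(weights)
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(iterations):
        # Authority of j: weighted sum of hub scores of pages linking to j.
        auth = normalize(
            [sum(weights[i][j] * hub[i] for i in range(n)) for j in range(n)]
        )
        # Hub of i: weighted sum of authority scores of pages i links to.
        hub = normalize(
            [sum(weights[i][j] * auth[j] for j in range(n)) for i in range(n)]
        )
    return auth, hub


# Toy example: page 0 links to pages 1, 2 and 3; page 3 is off-topic.
links = [
    [0, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
term_vectors = [[1, 0], [1, 0], [1, 0], [0, 1]]
auth, hub = weighted_hits(relevance_matrix(links, term_vectors))
```

In this toy graph the link to the off-topic page receives a lower weight than the links to the on-topic pages, so the off-topic page ends up with a lower authority score than it would under unweighted HITS. That is the mechanism by which weighting links is meant to limit topic drift.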
