Abstract

The Hyperlink-Induced Topic Search (HITS) algorithm, developed by Jon Kleinberg, uses the link structure of the Web to discover and rank pages relevant to a particular topic. However, it considers only the hyperlink structure and ignores page content entirely, and it treats all hyperlinks as equally important even though their importance may differ. In this paper, to mitigate topic drift, we propose a novel page ranking algorithm that combines hyperlink analysis with triadic closure theory, drawing on the Vector Space Model (VSM) and the TrustRank algorithm. The method first computes the relevance between any two web pages from their topic similarity and common reference degree. Using this relevance as a reference, it then constructs a new adjacency matrix and iteratively calculates the authority and hub values of the web pages. Next, it calculates a trust-degree for each web page in the base set with the trust-score algorithm. Finally, the score of each web page is computed as a linear combination of its authority and its trust-degree. In our experiments, we compared the proposed algorithm, PCTHITS (Web Page Topic Similarity, Common Reference Degree, Trust-degree), with five HITS-based algorithms: the original HITS algorithm and four classic improved variants. The experimental results demonstrate that the proposed algorithm outperforms all five.
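The pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the relevance weights, the trust values, the L1 normalization, and the mixing parameter `alpha` are all assumptions made for illustration, and the function names `weighted_hits`, `normalize`, and `merge_scores` are hypothetical. The sketch shows only the general shape: a HITS iteration over a weighted adjacency matrix, followed by a linear merge of authority and trust-degree.

```python
def normalize(values):
    """Scale a list of non-negative scores so they sum to 1 (L1 norm)."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else values


def weighted_hits(weights, iterations=50):
    """HITS power iteration on a weighted adjacency matrix.

    weights[i][j] is an assumed relevance weight for the link i -> j
    (0.0 means no link). In the paper this weight would come from topic
    similarity and common reference degree; here it is supplied directly.
    """
    n = len(weights)
    auth = [1.0] * n
    hub = [1.0] * n
    for _ in range(iterations):
        # Authority of page j: weighted sum of the hub scores of pages linking to j.
        auth = [sum(weights[i][j] * hub[i] for i in range(n)) for j in range(n)]
        # Hub of page i: weighted sum of the authority scores of pages i links to.
        hub = [sum(weights[i][j] * auth[j] for j in range(n)) for i in range(n)]
        auth = normalize(auth)
        hub = normalize(hub)
    return auth, hub


def merge_scores(auth, trust, alpha=0.5):
    """Linear merge of authority and trust-degree; alpha is an assumed parameter."""
    return [alpha * a + (1 - alpha) * t for a, t in zip(auth, trust)]


# Toy graph with made-up weights: page 0 links to pages 1 and 2, page 1 links to page 2.
weights = [
    [0.0, 1.0, 0.5],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
trust = [0.9, 0.4, 0.7]  # hypothetical trust-degrees, one per page

auth, hub = weighted_hits(weights)
scores = merge_scores(auth, trust, alpha=0.6)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

With all weights set to 1.0 and `alpha=1.0`, the sketch reduces to plain HITS authority ranking, which makes the effect of the relevance weights and the trust term easy to isolate.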
