Abstract

Link selection is an important component of a vertical search engine, and a better selection algorithm yields a better engine. We propose a new link selection algorithm based on the well-known HITS algorithm. The proposed algorithm aims to collect pages with high topic relevance in a specific domain. It combines two factors: an expanded metadata topic relevance score, calculated by combining analysis of link content with hyperlink structure characteristics; and an inherited score, which captures the influence of the parent pages' topic relevance scores and is calculated from their authority and hub values. Experimental results indicate that a spider using the proposed algorithm gathers a collection with high topic relevance. Furthermore, it largely avoids the Channel phenomenon and improves the accuracy of information collection.
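To make the two factors concrete, the following is a minimal Python sketch, not the authors' implementation. The `hits` function is the standard HITS power iteration for authority and hub values. The `link_priority` function, its weight `alpha`, and the way the parent's authority and hub values are averaged are hypothetical choices made for illustration only; the abstract does not give the actual formula.

```python
import math


def hits(graph, iterations=50):
    """Standard HITS iteration. `graph` maps each page to the pages it links to."""
    pages = set(graph) | {q for targets in graph.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum the hub values of all pages linking to a page.
        new_auth = {p: 0.0 for p in pages}
        for p, targets in graph.items():
            for q in targets:
                new_auth[q] += hub[p]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {p: v / norm for p, v in new_auth.items()}
        # Hub update: sum the authority values of all pages a page links to.
        new_hub = {p: sum(auth[q] for q in graph.get(p, [])) for p in pages}
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {p: v / norm for p, v in new_hub.items()}
    return auth, hub


def link_priority(meta_relevance, parent_relevance, parent_auth, parent_hub,
                  alpha=0.5):
    """Hypothetical score for a candidate link (assumed form, not from the paper).

    meta_relevance: topic relevance of the link's own metadata (anchor text etc.).
    parent_relevance: topic relevance score of the page containing the link.
    The inherited part scales the parent's relevance by the mean of its
    authority and hub values; `alpha` balances the two factors.
    """
    inherited = parent_relevance * (parent_auth + parent_hub) / 2.0
    return alpha * meta_relevance + (1.0 - alpha) * inherited
```

A spider could keep its frontier in a priority queue ordered by such a score, so that links with relevant metadata on relevant, well-connected parent pages are fetched first.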
