Abstract

An efficient search engine must be designed to provide relevant and accurate information that matches users' needs and interests. The quality of the downloaded records can be guaranteed only when crawlers download highly relevant web pages that reflect current topics or user trends. Earlier focused crawlers were used to download topic-specific pages, but they could not adapt to users' changing interests. There is therefore a need for crawlers that automatically track currently trending topics and download web pages that meet the user's present needs. This paper proposes a priority assigner and scheduler method for organizing Uniform Resource Locators (URLs) that helps the crawler track the user's interests and prioritize downloading documents that are relevant to the user's choices as well as to current trends. The experimental results confirm that crawling based on the proposed priority assigner and URL scheduler outperforms conventional crawling strategies based on Change-history or Site-Map methods, both in the quality of the downloaded web pages and in reducing network traffic over the Internet.
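
The abstract does not specify how priorities are computed or how URLs are scheduled, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes each URL comes with a user-interest score and a trend score in [0, 1], blends them with a hypothetical weight, and serves URLs from a priority queue. The names `assign_priority`, `UrlScheduler`, and `interest_weight` are illustrative assumptions.

```python
import heapq
import itertools


def assign_priority(user_interest, trend_score, interest_weight=0.6):
    """Blend two scores in [0, 1] into one priority.

    The linear blend and the default weight are assumptions made for
    illustration; the abstract does not give a formula.
    """
    return interest_weight * user_interest + (1.0 - interest_weight) * trend_score


class UrlScheduler:
    """Hypothetical URL frontier that hands out the highest-priority URL first."""

    def __init__(self):
        self._heap = []                    # entries: (-priority, sequence, url)
        self._counter = itertools.count()  # tie-breaker keeps insertion order
        self._seen = set()                 # avoids scheduling a URL twice

    def add(self, url, user_interest, trend_score):
        """Schedule a URL; return False if it was already scheduled."""
        if url in self._seen:
            return False
        self._seen.add(url)
        priority = assign_priority(user_interest, trend_score)
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), url))
        return True

    def next_url(self):
        """Return the most relevant pending URL, or None if the frontier is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

In this sketch a crawler loop would call `next_url()` repeatedly, download the page, and `add()` newly discovered links with freshly estimated scores. How those scores are estimated from user behavior and trends is left open here, since the abstract does not describe it.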
