<p><span lang="EN-IN">The large and wide range of information has become a tough time for crawlers and search engines to extract related information. This paper discusses about focused crawlers also called as topic specific crawler and variations of focused crawlers leading to distributed architecture, i.e., context aware notification architecture. To get the relevant pages from a huge amount of information available in the internet we use the focused crawler. This can bring out the relevant pages for the given topic with less number of searches in a short time. Here the input to the focused crawler is a topic specified using exemplary documents, but not using the keywords. Focused crawlers avoid the searching of all the web documents instead it searches over the links that are relevant to the crawler boundary. The Focused crawling mechanism helps us to save CPU time to large extent to keep the crawl up-to-date.</span></p>
Read full abstract