Abstract

This chapter introduces a novel focused Website crawler that applies the paradigm of focused crawling to the search for relevant Websites. Focused Web crawlers are an alternative to the well-established Web search engines. While the well-known focused crawlers retrieve relevant webpages, various applications target whole Websites instead of single webpages. The proposed crawler is based on a two-level architecture, with corresponding crawl strategies, that uses an explicit concept of Websites. The external crawler views the Web as a graph of linked Websites, selects the Websites to be examined next, and invokes internal crawlers. Each internal crawler views only the webpages of a single given Website and performs focused crawling within that Website. The analysis shows that the proposed focused Website crawler clearly outperforms previous focused crawling methods that were adapted to retrieve Websites instead of single webpages.
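The two-level architecture described above can be illustrated with a minimal sketch. It is not the chapter's implementation: the in-memory `TOY_WEB`, the keyword-overlap `page_score`, the averaging of page scores into a site score, the `threshold`, and the budgets are all assumptions made for illustration. The sketch only shows the division of labour, with an external crawler that ranks Websites and an internal crawler that performs a focused crawl inside one Website and hands cross-site links back.

```python
import heapq
from urllib.parse import urlparse

# Hypothetical toy Web: URL -> (page text, outgoing links). Not from the chapter.
TOY_WEB = {
    "http://a.example/": ("focused crawler research",
                          ["http://a.example/p1", "http://b.example/"]),
    "http://a.example/p1": ("crawler architecture notes", ["http://c.example/"]),
    "http://b.example/": ("cooking recipes", ["http://b.example/r"]),
    "http://b.example/r": ("pasta recipes", []),
    "http://c.example/": ("web crawler benchmarks", []),
}
TOPIC = {"crawler", "focused", "web"}  # assumed topic description


def site_of(url):
    """Identify a Website by its host name (a simplifying assumption)."""
    return urlparse(url).netloc


def page_score(text):
    """Assumed relevance measure: fraction of topic terms found in the page."""
    words = set(text.lower().split())
    return len(words & TOPIC) / len(TOPIC)


def internal_crawl(start_url, web, page_budget=10):
    """Focused crawl within one Website.

    Returns (site score, set of links that lead to other Websites).
    """
    site = site_of(start_url)
    frontier = [(-1.0, start_url)]  # max-heap via negated priority
    seen, scores, external = {start_url}, [], set()
    while frontier and len(scores) < page_budget:
        _, url = heapq.heappop(frontier)
        if url not in web:
            continue  # page could not be fetched
        text, links = web[url]
        score = page_score(text)
        scores.append(score)
        for link in links:
            if site_of(link) != site:
                external.add(link)  # left for the external crawler
            elif link not in seen:
                seen.add(link)
                # Pages linked from relevant pages are visited first.
                heapq.heappush(frontier, (-score, link))
    site_score = sum(scores) / len(scores) if scores else 0.0
    return site_score, external


def external_crawl(seed_url, web, threshold=0.3, site_budget=10):
    """Crawl the graph of Websites, invoking one internal crawl per Website."""
    frontier = [(-1.0, seed_url)]
    visited, relevant = set(), []
    while frontier and len(visited) < site_budget:
        _, entry_url = heapq.heappop(frontier)
        site = site_of(entry_url)
        if site in visited:
            continue
        visited.add(site)
        score, external = internal_crawl(entry_url, web)
        if score >= threshold:
            relevant.append(site)
        for link in external:
            if site_of(link) not in visited:
                # Websites linked from relevant Websites are examined first.
                heapq.heappush(frontier, (-score, link))
    return relevant


if __name__ == "__main__":
    print(external_crawl("http://a.example/", TOY_WEB))
```

In this toy setting the external crawler never inspects individual pages itself; it only sees per-Website scores and cross-site links, which is the separation of concerns the abstract describes. A real system would replace `page_score` with a trained classifier and `TOY_WEB` with actual fetching.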

