Abstract

The web consists of the surface web and the hidden web. The surface web, also known as the publicly indexable web, can be reached by search engines that follow the hyperlinks on pages and apply simple keyword-matching schemes. The hidden web refers to content that sits behind HTML forms; it holds a large collection of data that link-based search engines cannot reach. A study conducted at the University of California, Berkeley estimated that the deep (hidden) web consists of around 91,000 terabytes of data, whereas the surface web is only about 167 terabytes. Both hidden web and surface web crawlers return huge result sets for a user query, but users commonly look only at the top ten or twenty results that can be seen without scrolling. Users rarely look at results beyond the first results page, so ranking of the results is needed. Ranking web data remains a major challenge, and various scholars have proposed more effective and efficient ranking techniques. This paper explores various ranking methods for the hidden web as well as the surface web.
