Abstract

The web contains an enormous amount of information, but only a small fraction of it is visible to users; a far larger portion remains hidden. This is because traditional search engines cannot index or access all of it. They reach only the information that can be retrieved by following hypertext links. Content behind forms, including forms that involve a login or authorization process, is not accessed by them. The hidden web is the part of the web that traditional web crawlers do not access. An important problem in retrieving relevant, good-quality information from the huge hidden web databases is how to locate and identify their entry points, i.e., forms, on the web. Because traditional web crawlers may be unable to retrieve all information from deep web databases, this limitation is the main motivation for retrieving information from the deep web. Issues and challenges related to the problem are also discussed. An architecture for accessing hidden web databases that uses intelligent agent technology with reinforcement learning is proposed. The experimental results show that reinforcement learning helps overcome existing problems and that the proposed approach outperforms existing hidden web crawlers in terms of precision and recall.
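
To make the notion of a form as an entry point concrete, the following is a minimal sketch, not the architecture proposed in the paper. It scans a page for forms and labels each one with a simple, assumed heuristic: a form with a password field is treated as a login form, and a form with a free-text field as a candidate search interface. The names `FormFinder`, `classify_form`, and `find_entry_points` are hypothetical and introduced only for illustration.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Collects each <form> on a page with the types of its <input> fields."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per completed form
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action", ""), "input_types": []}
        elif tag == "input" and self._current is not None:
            # An <input> without a type attribute defaults to a text field.
            self._current["input_types"].append((attrs.get("type") or "text").lower())

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def classify_form(form):
    """Assumed heuristic (not from the paper): label a form by its input types."""
    types = form["input_types"]
    if "password" in types:
        return "login"
    if any(t in ("text", "search") for t in types):
        return "search"
    return "other"


def find_entry_points(html):
    """Return (action, label) pairs for every form found in the HTML string."""
    finder = FormFinder()
    finder.feed(html)
    finder.close()
    return [(f["action"], classify_form(f)) for f in finder.forms]
```

A crawler could call `find_entry_points` on each fetched page and keep only the forms labelled `"search"` as candidate entry points. A real system would need a far richer classifier than this two-rule heuristic.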

