Abstract

A large part of the data on the World Wide Web resides in the deep web. Most deep web data sources support only simple text interfaces for querying, which are easy to use but have limited expressive power. As a result, processing complex structured queries over the deep web currently involves a large amount of manual work. Our work addresses the gap between users' need to express and execute complex structured queries over the deep web and the simple, limited input interfaces of existing deep web data sources. This paper presents a query planning problem formulation, with novel algorithms and optimizations, that enables a high-level and highly expressive query language to be supported over deep web data sources. We target three types of complex queries: select-project-join queries, aggregation queries, and nested queries. We have developed query planning algorithms to generate query plans for each of these types, and we propose several optimization techniques to further speed up query plan execution. Our experiments show that our algorithms scale well and that, for over 90% of the experimental queries, the execution time and result quality of the query plans they generate are very close to those of the optimal plans generated by an exhaustive search algorithm. Furthermore, our optimization techniques outperform an existing optimization method in both the reduction in transmitted data records and the query execution speedup.

