Abstract

Query processing in Web search engines today is mainly performed within a single site or data center, which must scale as the Web grows and as users demand fast answers to their queries. Constraints on the size and cost of data centers, however, may limit the scalability of search engines. Multi-site search engines that perform distributed query processing represent one way to overcome such constraints. Each site processes as many queries as possible locally, keeping latency low by not contacting remote sites. Whether a query is forwarded to remote sites depends on the document collections those sites hold. Multi-site search engines pose several new challenges. When a site updates its index, it has to inform the other sites. These updates, however, are not instantaneous, because of the volume of data exchanged or possible network failures. While indexes are inconsistent across sites, queries may not be forwarded optimally. In this work, we investigate the impact of index inconsistencies on a distributed query processing algorithm when indexes are updated. We observe that delayed propagation of index information reduces the effectiveness of query processing, because queries are less likely to be routed optimally.
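
The effect of delayed index propagation on query forwarding can be illustrated with a toy sketch. This is not the algorithm studied in the paper, which the abstract does not specify. The `Site` class, its `remote_view` field, the `propagate` helper, and the rule "forward only if the remote site is believed to hold a better-scoring document" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical search site; every name here is illustrative."""
    name: str
    # Local index: term -> best score of any local document for that term.
    index: dict[str, float] = field(default_factory=dict)
    # Possibly stale view of remote indexes: site name -> term -> best score.
    remote_view: dict[str, dict[str, float]] = field(default_factory=dict)

    def sites_to_forward(self, term: str) -> list[str]:
        """Return remote sites believed to beat the local result for term."""
        local = self.index.get(term, 0.0)
        return sorted(
            name
            for name, view in self.remote_view.items()
            if view.get(term, 0.0) > local
        )


def propagate(source: Site, target: Site) -> None:
    """Copy the source's current index into the target's view of it."""
    target.remote_view[source.name] = dict(source.index)


a = Site("A", index={"jaguar": 0.4})
b = Site("B", index={"jaguar": 0.2})
propagate(b, a)                       # A learns B's index as it is now

b.index["jaguar"] = 0.9               # B indexes a better document
stale = a.sites_to_forward("jaguar")  # A still believes B's best score is 0.2
propagate(b, a)                       # the update finally reaches A
fresh = a.sites_to_forward("jaguar")

print(stale, fresh)  # [] ['B']
```

Under these assumptions, site A answers the query locally while its view of B is stale and misses B's better document. Once the update arrives, A forwards the query to B.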
