Analyzing imbalance among homogeneous index servers in a web search system

C.S. Badue, R. Baeza-Yates, B. Ribeiro-Neto, A. Ziviani, N. Ziviani

doi:10.1016/j.ipm.2006.09.002

Abstract

The performance of parallel query processing in a cluster of index servers is crucial for modern web search systems. In such a scenario, the response time depends mainly on the execution time of the slowest server to generate a partial ranked answer. Previous approaches investigate performance issues in this context using simulation, analytical modeling, experimentation, or a combination of these. Nevertheless, these approaches simply assume balanced execution times among homogeneous servers (for instance, by uniformly distributing the document collection among them), a scenario that we did not observe in our experimentation. On the contrary, we found that even with a balanced distribution of the document collection among index servers, correlations between the frequency of a term in the query log and the size of its corresponding inverted list lead to imbalances in query execution times at these same servers, because these correlations affect disk caching behavior. Further, the size of the main memory at each server relative to its disk space usage and the number of servers participating in the parallel query processing also affect the imbalance of local query execution times. These are relevant findings that have not been reported before and that, we understand, are of interest to the research community.
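The dependence of response time on the slowest server can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the sample timings, and the choice of max-to-mean ratio as an imbalance measure are all assumptions made for illustration, and merging and network costs are ignored.

```python
from statistics import mean


def response_time(server_times):
    """Time until all partial ranked answers are available.

    Simplifying assumption: servers run in parallel and merging and
    network costs are ignored, so the slowest server sets the time.
    """
    return max(server_times)


def imbalance(server_times):
    """Hypothetical imbalance measure: slowest time over average time.

    A value of 1.0 means all servers took equally long.
    """
    return max(server_times) / mean(server_times)


# Made-up local execution times (milliseconds) for one query on four servers.
times_ms = [12.0, 15.0, 11.0, 30.0]

print(response_time(times_ms))        # 30.0
print(round(imbalance(times_ms), 2))  # 1.76
```

In this made-up case three servers finish in 15 ms or less, yet the query takes 30 ms, which is why imbalance among otherwise identical servers matters for overall response time.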

Journal: Information Processing and Management
Publication Date: Nov 1, 2006
Citations: 50

Similar Papers

Scalable Data-Intensive Analytics
Meichun Hsu ... Qiming Chen
01 Jan 2009

Modeling performance-driven workload characterization of web search systems
Claudine Badue ... Artur Ziviani
01 Jan 2006

Parallel Query Processing on the Grid
...
10 Jun 2009

Dynamic parallel query processing for distributed objects
Y Jiang ... A Makinou
25 Aug 1998
