Query-driven indexing for scalable peer-to-peer text retrieval

Gleb Skobeltsyn, Toan Luu, Ivana Podnar Žarko, Martin Rajman, Karl Aberer

doi:10.1016/j.future.2008.03.006

Abstract

In this paper, we present a query-driven indexing/retrieval strategy for efficient full-text retrieval from large document collections distributed within a structured P2P network. Our indexing strategy is based on two important properties: (1) the generated distributed index stores posting lists for carefully chosen indexing term combinations that are frequently present in user queries, and (2) posting lists containing too many document references are truncated to a bounded number of their top-ranked elements. These two properties guarantee acceptable latency and bandwidth requirements, essentially because the number of indexing term combinations remains scalable and the posting lists transmitted during retrieval never exceed a constant size. A novel index update mechanism efficiently handles the addition of new documents to the document collection. Thus, the generated distributed index corresponds to a constantly evolving query-driven indexing structure that efficiently follows the current information needs of the users and changes in the document collection. We show that the size of the index and the generated indexing/retrieval traffic remain manageable even for Web-size document collections, at the price of a marginal loss in precision for rare queries. Our theoretical analysis and experimental results provide convincing evidence for the feasibility of the query-driven indexing strategy for large-scale P2P text retrieval.
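The two properties named in the abstract can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: the function names, the `min_count` and `max_postings` parameters, the limit on combination size, and the term-frequency ranking are all assumptions made for illustration, and distribution over a structured P2P network and the index update mechanism are left out entirely.

```python
from collections import Counter
from itertools import combinations


def frequent_term_sets(queries, min_count, max_size=2):
    """Return term combinations seen in at least `min_count` queries.

    `min_count` and `max_size` are hypothetical tuning knobs.
    """
    counts = Counter()
    for query in queries:
        terms = sorted(set(query.lower().split()))
        for size in range(1, max_size + 1):
            for combo in combinations(terms, size):
                counts[combo] += 1
    return {key for key, count in counts.items() if count >= min_count}


def build_index(docs, keys, max_postings):
    """Build one posting list per key, truncated to `max_postings` entries.

    Ranking here is a plain sum of term frequencies, chosen only to keep
    the sketch short; it stands in for whatever ranking a real system uses.
    """
    index = {}
    for key in keys:
        scored = []
        for doc_id, text in docs.items():
            words = text.lower().split()
            freqs = [words.count(term) for term in key]
            if all(freqs):  # the document must contain every term of the key
                scored.append((sum(freqs), doc_id))
        # Highest score first; ties broken by document id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        index[key] = [doc_id for _, doc_id in scored[:max_postings]]
    return index
```

In this sketch, a term combination that rarely occurs in queries never gets a posting list, and no stored posting list grows beyond `max_postings` entries, which mirrors the bounded-size argument made in the abstract.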

Journal: Future Generation Computer Systems
Publication Date: Jun 5, 2008
Citations: 45

Similar Papers

Web text retrieval with a P2P query-driven index
Gleb Skobeltsyn ... Ivana Podnar Zarko
23 Jul 2007

Query-driven indexing for scalable peer-to-peer text retrieval
...
06 Jun 2007

Query-driven indexing for peer-to-peer text retrieval
Gleb Skobeltsyn ... Martin Rajman
08 May 2007

A dummy-based user privacy protection approach for text information retrieval
Zongda Wu ... Enhong Chen
Knowledge-Based Systems | VOL. 195
24 Feb 2020
