In the past few years, peer-to-peer (P2P) networks have become a promising paradigm for building a wide variety of distributed systems and applications. The most popular P2P application to date is file sharing, as in Gnutella and Kazaa. These systems are usually referred to as unstructured, and search in unstructured P2P networks usually involves flooding or random walks. In structured P2P networks (DHTs), on the other hand, search is usually performed by looking up a distributed inverted index. The efficiency of the search mechanism is the key to the scalability of a P2P content sharing system. So far, neither unstructured nor structured P2P networks alone can solve the search problem in a satisfactory way. In this paper, we propose to combine the strengths of both unstructured and structured P2P networks to achieve more efficient search. Specifically, we propose to enhance search in unstructured P2P overlay networks by building a partial index of shared data using a structured P2P network. The index maintains two types of information, both characterized by data properties: the top interests of peers and globally unpopular data. The proposed search protocol, assisted search with partial indexing, uses the index to improve search in three ways. First, the index helps peers find other peers with similar interests, so that the unstructured search overlay is formed to reflect peer interests. Second, the index provides search hints for data that are difficult to locate by exploring peer interest locality, and these hints can be used for a second-chance search. Third, the index helps to locate unpopular data items. Experiments based on a P2P file sharing trace show that assisted search with a lightweight partial indexing service achieves a significantly higher success rate in locating data than Gnutella and a hit-rate-based protocol in unstructured P2P systems, while incurring low search latency and overhead.
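The three uses of the index can be illustrated with a minimal sketch. It is not the protocol evaluated in the paper: the DHT is replaced by in-memory dictionaries, flooding is reduced to a single hop over the interest-based neighbors, and all names (`PartialIndex`, `assisted_search`, `holdings`) are hypothetical.

```python
from collections import defaultdict


class PartialIndex:
    """Toy stand-in for the DHT-hosted partial index (plain dicts, no network)."""

    def __init__(self):
        # data property -> peers that list it among their top interests
        self.interest_to_peers = defaultdict(set)
        # data property -> peers holding globally unpopular items with it
        self.unpopular_to_peers = defaultdict(set)

    def publish_interests(self, peer, top_properties):
        for prop in top_properties:
            self.interest_to_peers[prop].add(peer)

    def publish_unpopular(self, peer, prop):
        self.unpopular_to_peers[prop].add(peer)

    def similar_peers(self, peer, top_properties):
        """Use 1: find candidate overlay neighbors that share a top interest."""
        found = set()
        for prop in top_properties:
            found |= self.interest_to_peers.get(prop, set())
        found.discard(peer)
        return found


def assisted_search(prop, neighbors, holdings, index):
    """Return (peers holding `prop`, name of the stage that found them)."""
    # Stage 1: ask the interest-based overlay neighbors (one hop only here).
    hits = {p for p in neighbors if prop in holdings.get(p, set())}
    if hits:
        return hits, "overlay"
    # Stage 2 (use 2): second-chance search guided by interest hints.
    hints = index.interest_to_peers.get(prop, set())
    hits = {p for p in hints if prop in holdings.get(p, set())}
    if hits:
        return hits, "second-chance"
    # Stage 3 (use 3): look the property up among indexed unpopular data.
    hits = set(index.unpopular_to_peers.get(prop, set()))
    return hits, ("unpopular" if hits else "miss")
```

In this sketch a peer would call `similar_peers` when joining to choose its neighbors, and `assisted_search` falls back to the index only when the overlay neighbors cannot answer, which is what keeps the index partial and lightweight.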