Abstract

Distributed peer-to-peer storage systems rely on the voluntary participation of peers to manage a shared storage pool. Files are generally replicated at several sites to provide acceptable levels of availability. If disk space on these peers is not carefully monitored and provisioned, the system may be unable to guarantee availability for certain files. In particular, the identification and elimination of redundant data are important problems that arise in long-lived systems. Scalability and availability are competing goals in these networks: scalability concerns dictate aggressive elimination of replicas, while availability considerations argue the converse. In this paper, the authors present a novel and efficient solution that addresses both goals with respect to the management of redundant data. Specifically, they address the problem of duplicate elimination in systems connected over an unstructured peer-to-peer network, in which there is no a priori binding between an object and its location. They propose a new randomized protocol that solves this problem in a scalable and decentralized fashion without compromising the availability requirements of the application. Performance results from both large-scale simulations and a prototype built on PlanetLab demonstrate that the protocols provide high probabilistic guarantees of success while incurring minimal administrative overhead.
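The abstract does not describe the protocol's mechanics, but the tension it highlights, aggressive replica elimination versus availability, can be illustrated with a small Monte Carlo sketch. The model below is a hypothetical illustration only, not the paper's actual protocol: it assumes each of `n` replica holders independently retains its copy with probability `p`, so a file remains available whenever at least one holder keeps a copy, which happens with probability 1 - (1 - p)^n.

```python
import random


def estimate_availability(num_replicas, p_retain, trials=10_000, seed=0):
    """Monte Carlo estimate of the probability that at least one replica
    survives a round of randomized elimination.

    Hypothetical model (not the paper's protocol): each holder
    independently keeps its copy with probability p_retain.
    """
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        # The file stays available if any one holder retains its copy.
        if any(rng.random() < p_retain for _ in range(num_replicas)):
            survived += 1
    return survived / trials


if __name__ == "__main__":
    # With 4 replicas and p = 0.5, the analytic value is 1 - 0.5**4 = 0.9375.
    print(estimate_availability(4, 0.5))
```

The sketch makes the trade-off concrete: lowering `p_retain` frees more disk space (more duplicates eliminated) but raises the chance that every holder discards its copy, which is exactly the availability risk the paper's randomized protocol is designed to bound.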
