Fast LH*

Juan Chabkinian, Thomas J. E. Schwarz, S.J.

doi:10.1007/s10766-015-0371-8

Abstract

Linear Hashing is an efficient and widely used version of extendible hashing. LH* is its distributed version, which stores key-value pairs on up to hundreds of thousands of sites in a distributed system. LH* implements the dictionary data structure efficiently without a central component and supports the key-based operations of insertion, deletion, update, and retrieval, as well as the scan operation. Because it does not use a central addressing component, clients or servers in LH* can commit an addressing error by sending a request to the wrong server. That server then forwards the message to the correct server, either directly or with one, but never more than one, additional forward operation. We discuss here methods to avoid this double forward, which, while rare, might still breach quality-of-service guarantees. We compare our methods with $$\mathrm{LH}*_{\mathrm{RS}^{\mathrm{\tiny P2P}}}$$, which pushes information about changes in the file structure to clients, whether they are active or not. A second problem, especially relevant in high-churn environments such as modern data centers, is that sites can suddenly become inaccessible. The various high-reliability and scalable-reliability versions of LH* then reconstruct the data lost on such a site elsewhere. We present a solution to the resulting "wandering bucket" problem that allows clients to find the data at their new location.

Full Text