Abstract
Introduction: Distributed data processing and storage systems require efficient methods to distribute keys across buckets. While simple and fast, the traditional modulo‐based mapping is unstable when the number of buckets changes, leading to spikes in system resource utilization, such as network or database requests. Consistent hash algorithms minimize remappings but are either significantly slower, require floating‐point arithmetic, or are based on a family of hash functions rarely available in standard libraries. This work introduces JumpBackHash, a consistent hash algorithm that overcomes those shortcomings.

Methodology: JumpBackHash applies the concept of active indices borrowed from consistent weighted sampling, which inherently leads to consistency. It generates the active indices in reverse order, which avoids floating‐point operations, enables the minimization of consumed random values and the use of a standard pseudorandom generator, and finally leads to a very efficient algorithm.

Results: Theoretical analysis shows that JumpBackHash has an expected constant runtime. The expected value and the variance of the number of consumed random values perfectly agree with the experiments. Empirical tests also confirm the consistency.

Conclusion: JumpBackHash offers a fast and efficient solution for uniformly distributing keys across buckets in distributed systems. Its simplicity, performance, and the availability of a production‐ready Java implementation as part of the Hash4j open source library make it a viable replacement for the modulo‐based approach for improving assignment and system stability.
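To make the instability of modulo-based mapping concrete, the following minimal sketch counts how many keys change buckets when the bucket count grows from 10 to 11. It is not JumpBackHash: as a stand-in consistent hash it uses the well-known jump consistent hash of Lamping and Veach, which relies on floating-point arithmetic (one of the shortcomings named above). The helper names `modulo_bucket` and `count_moved`, the key set, and the bucket counts are assumptions chosen for illustration only.

```python
import random

MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit wraparound


def modulo_bucket(key: int, num_buckets: int) -> int:
    """Traditional modulo-based mapping of a 64-bit key to a bucket."""
    return key % num_buckets


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash (Lamping and Veach), used here only as a
    stand-in consistent hash; it is NOT JumpBackHash."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # Linear congruential step producing the next pseudorandom value.
        key = (key * 2862933555777941757 + 1) & MASK64
        # Floating-point division determines the next jump target.
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def count_moved(assign, keys, n: int) -> int:
    """Number of keys whose bucket changes when going from n to n + 1 buckets."""
    return sum(1 for k in keys if assign(k, n) != assign(k, n + 1))


# Illustrative key set: 10,000 pseudorandom 64-bit keys with a fixed seed.
rng = random.Random(42)
keys = [rng.getrandbits(64) for _ in range(10_000)]
n = 10

print("modulo moved:", count_moved(modulo_bucket, keys, n))
print("jump moved:  ", count_moved(jump_consistent_hash, keys, n))
```

With uniformly distributed keys, the modulo mapping is expected to reassign roughly 10/11 of all keys, whereas a consistent hash is expected to reassign only about 1/11 of them, and each of those moves to the newly added bucket.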