Understanding Response Time in the Redundancy-d System

Kristen Gardner,Mor Harchol-Balter,Samuel Zbarsky,Mark Velednitsky,Alan Scheller-Wolf

doi:10.1145/3003977.3003989

Abstract

An increasingly prevalent technique for improving response time in queueing systems is the use of redundancy. In a system with redundant requests, each job that arrives to the system is copied and dispatched to multiple servers. As soon as the first copy completes service, the job is considered complete, and all remaining copies are deleted. A great deal of empirical work has demonstrated that redundancy can significantly reduce response time in systems ranging from Google's BigTable service to kidney transplant waitlists. We propose a theoretical model of redundancy, the Redundancy- d system, in which each job sends redundant copies to d servers chosen uniformly at random. We derive the first exact expressions for mean response time in Redundancy-d systems with any finite number of servers. We also find asymptotically exact expressions for the distribution of response time as the number of servers approaches infinity.

Full Text