Abstract. We study a class of load-balancing algorithms for many-server systems with $N$ servers. Each server has a buffer of size $b-1$ with $b=O(\sqrt{\log N})$, i.e., a server can have at most one job in service and $b-1$ jobs queued. We focus on the steady-state performance of load-balancing algorithms in the heavy-traffic regime in which the load of the system is $\lambda = 1 - \gamma N^{-\alpha}$ for $0<\alpha<0.5$ and $\gamma > 0$, which we call the sub-Halfin–Whitt regime ($\alpha=0.5$ is the so-called Halfin–Whitt regime). We establish a sufficient condition under which the probability that an incoming job is routed to an idle server is 1 asymptotically (as $N \to \infty$) at steady state. The class of load-balancing algorithms that satisfy the condition includes join-the-shortest-queue, idle-one-first, join-the-idle-queue, and power-of-$d$-choices with $d\geq \frac{r}{\gamma}N^\alpha\log N$, where $r$ is a positive integer. The proof of the main result is based on the framework of Stein's method. A key contribution is the use of a simple generator approximation based on state space collapse.
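
To make the parameters above concrete, the following is a minimal Python sketch, not the authors' implementation. It computes the load $\lambda = 1 - \gamma N^{-\alpha}$ and the smallest number of sampled servers $d$ that meets $d\geq \frac{r}{\gamma}N^\alpha\log N$, and it routes a single arrival under power-of-$d$-choices. The function names are hypothetical. It assumes that $\log$ is the natural logarithm, that $d$ is capped at $N$, that the $d$ servers are sampled without replacement, and that a job sent to a server already holding $b$ jobs is dropped.

```python
import math
import random


def heavy_traffic_load(n, alpha, gamma):
    """Per-server load lambda = 1 - gamma * n**(-alpha) in the sub-Halfin-Whitt regime."""
    if not (0 < alpha < 0.5) or gamma <= 0:
        raise ValueError("requires 0 < alpha < 0.5 and gamma > 0")
    return 1.0 - gamma * n ** (-alpha)


def min_choices(n, alpha, gamma, r=1):
    """Smallest integer d with d >= (r / gamma) * n**alpha * log(n).

    Assumptions: natural logarithm, and d capped at n because at most
    n distinct servers can be sampled.
    """
    d = math.ceil(r / gamma * n ** alpha * math.log(n))
    return min(d, n)


def route_power_of_d(queues, d, b, rng):
    """Route one arrival: sample d distinct servers, pick the shortest queue.

    queues[i] counts the jobs at server i (in service plus waiting), so a
    server holds at most b jobs. Returns the chosen server index, or None
    if that server is full and the job is dropped (an assumed policy).
    """
    sampled = rng.sample(range(len(queues)), d)
    target = min(sampled, key=lambda i: queues[i])
    if queues[target] >= b:
        return None
    queues[target] += 1
    return target
```

For example, with $N = 10^4$, $\alpha = 0.25$, $\gamma = 1$, and $r = 1$, `heavy_traffic_load` gives a load of about 0.9 and `min_choices` gives $d = 93$, so each arrival would probe fewer than 1% of the servers.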