We present an ab initio theory of the Gilbert damping in substitutionally disordered ferromagnetic alloys. The theory rests on newly introduced nonlocal torques, which replace the traditional local torque operators in the well-known torque-correlation formula and which can be formulated within the atomic-sphere approximation. The formalism is sketched in a simple tight-binding model and worked out in detail in the relativistic tight-binding linear muffin-tin orbital (TB-LMTO) method and the coherent potential approximation (CPA). The resulting nonlocal torques are represented by nonrandom, non-site-diagonal, and spin-independent matrices, which simplifies the configuration averaging. The CPA vertex corrections play a crucial role in the internal consistency of the theory and in its exact equivalence to other first-principles approaches based on random local torques. This equivalence is also illustrated by the calculated Gilbert damping parameters for binary NiFe and FeCo random alloys, for pure iron with a model atomic-level disorder, and for stoichiometric FePt alloys with a varying degree of L1₀ atomic long-range order.