We formulate a particle- and force-level, activated-dynamics-based statistical mechanical theory for the continuous startup nonlinear shear rheology of ultradense glass-forming hard sphere fluids and colloidal suspensions in the context of the elastically collective nonlinear Langevin equation approach and a generalized Maxwell model constitutive equation. Activated structural relaxation is described as a coupled local-nonlocal event involving caging and longer-range collective elasticity which controls the characteristic stress relaxation time. Theoretical predictions for the deformation-induced enhancement of mobility, the onset of relaxation acceleration at remarkably low values of stress, strain, or shear rate, apparent power-law thinning of the steady-state structural relaxation time and viscosity, a nonvanishing activation barrier in the shear thinning regime, an apparent Herschel–Bulkley form of the shear rate dependence of the steady-state shear stress, exponential growth of different measures of a yield or flow stress with packing fraction, and reduced fragility and dynamic heterogeneity under deformation were previously shown to be in good agreement with experiments. The central new question we address here is the defining feature of the transient response: the stress overshoot. In contrast to the steady-state flow regime, understanding the transient response requires an explicit treatment of the coupled nonequilibrium evolution of structure, elastic modulus, and stress relaxation time. We formulate a new quantitative model for this aspect in a physically motivated and computationally tractable manner.
Theoretical predictions for the stress overshoot are shown to be in good agreement with experimental observations in the metastable ultradense regime of hard sphere colloidal suspensions as a function of shear rate and packing fraction. Accounting for deformation-assisted activated motion appears to be crucial for both the transient and steady-state responses.