This paper investigates the optimal synchronization and disturbance rejection problem for leader-follower multi-agent systems (MASs) subject to input constraints, using an online actor-critic (A-C) algorithm. In contrast to existing work on optimal synchronization, the smoothness assumption on the value function that is common in typical reinforcement learning (RL) formulations is not satisfied here because of the input constraints. To relax this assumption, the vanishing viscosity method is introduced to construct a more general value function, for which the modified Hamilton–Jacobi–Isaacs (HJI) equation admits a smooth solution. Based on the modified HJI equation, an improved A-C neural network (NN) algorithm is developed to find a smooth approximate optimal control solution. Finally, a simulation is performed to demonstrate the effectiveness of the proposed approach.