The back-propagation learning rule is modified by augmenting the classical gradient-descent algorithm (which uses only a proportional term) with integral and derivative terms of the gradient. The effect of these terms on the convergence behaviour of the objective function is studied and compared with that of the momentum (MOM) method. It is observed that, with appropriate tuning of the proportional-integral-derivative (PID) parameters, the rate of convergence is greatly improved and local minima can be overcome. The integral action also helps in locating a minimum quickly. A guideline is presented for tuning the PID parameters appropriately, and an “integral suppression scheme” is proposed that exploits the PID principles to achieve faster convergence to a desired minimum.
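To illustrate the general idea, the following is a minimal sketch of a PID-style weight update, in which the proportional term is the current gradient, the integral term is the running sum of past gradients, and the derivative term is the change in the gradient between iterations. The function and parameter names (`pid_gradient_step`, `kp`, `ki`, `kd`) and the specific gain values are illustrative assumptions, not the paper's actual update rule or tuning.

```python
import numpy as np

def pid_gradient_step(w, grad, state, kp=0.5, ki=0.01, kd=0.1):
    """One PID-style weight update built from the current and past gradients.

    `state` carries the running gradient sum (integral term) and the
    previous gradient (for the derivative term) between calls.
    All gains are hypothetical placeholders.
    """
    integral = state.get("integral", np.zeros_like(grad)) + grad
    prev_grad = state.get("prev_grad", np.zeros_like(grad))
    derivative = grad - prev_grad

    # Classical gradient descent keeps only the proportional term (kp * grad);
    # the integral and derivative terms are the added contributions.
    delta_w = -(kp * grad + ki * integral + kd * derivative)

    state["integral"] = integral
    state["prev_grad"] = grad
    return w + delta_w, state


# Usage sketch: minimise a simple quadratic objective f(w) = ||w||^2 / 2,
# whose gradient is w, purely to show how the state is threaded through.
w = np.array([3.0, -2.0])
state = {}
for _ in range(50):
    grad = w  # gradient of the toy objective
    w, state = pid_gradient_step(w, grad, state)
```

In this sketch, suppressing the integral term (e.g. resetting or zeroing `state["integral"]` near a minimum) would correspond loosely to the kind of integral suppression the abstract refers to, though the paper's actual scheme may differ.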