Modern machine learning theory finds application in many areas of human activity. One of the most widespread tasks is pattern recognition in satellite images. A person cannot recognize a large number of images in a short time, which has led researchers to automate the process, for example with neural networks. Minimizing the loss function and applying ensemble learning both raise pattern recognition accuracy. We propose a robust difference gradient positive-negative momentum optimization algorithm that reaches the global minimum of the loss function with higher accuracy and in fewer iterations than known analogs. The algorithm combines a generalized moving-average estimation approach with more effective learning rate control through additional parameters. The proposed optimizer has a regret bound whose rate belongs to OT, and it converges to the global minimum. The main problems in optimization theory are vanishing and exploding gradients, under which standard gradient-based algorithms fail to reach the required objective function value. Both problems arise in the Rastrigin and Rosenbrock test functions, on which the proposed algorithm attains the global extremum in the fewest iterations and converges more stably than state-of-the-art methods. We then train deep convolutional neural networks with different optimizers on satellite images from the University of California, Merced dataset, which contains 21 object classes; the proposed algorithm gives the highest accuracy. We also suggest an ensemble-learning model consisting of 4 networks with different optimizers. The prediction results receive weight coefficients distributed according to majority voting, and the ensemble neural network is retrained, achieving higher pattern recognition accuracy.
The suggested ensemble-learning model with the developed optimizer raised the accuracy by 1–4 percentage points.
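
For readers unfamiliar with the two benchmark functions named above, the following is a minimal Python sketch of their standard textbook forms. The constants (A = 10 for Rastrigin; a = 1 and b = 100 for Rosenbrock) and the function names are assumptions taken from the common definitions, not from this work, and the sketch does not implement the proposed optimizer.

```python
import math


def rastrigin(x, A=10.0):
    """Rastrigin function (assumed standard form, A = 10).

    Highly multimodal: many local minima surround the global
    minimum f(0, ..., 0) = 0.
    """
    return A * len(x) + sum(xi * xi - A * math.cos(2.0 * math.pi * xi) for xi in x)


def rosenbrock(x, a=1.0, b=100.0):
    """Rosenbrock function (assumed standard form, a = 1, b = 100).

    A narrow curved valley leads to the global minimum
    f(1, ..., 1) = 0; gradients are steep across the valley
    and very small along it.
    """
    return sum(
        b * (x[i + 1] - x[i] ** 2) ** 2 + (a - x[i]) ** 2
        for i in range(len(x) - 1)
    )


if __name__ == "__main__":
    # Evaluate both functions at their known global minimizers.
    print(rastrigin([0.0, 0.0]))
    print(rosenbrock([1.0, 1.0]))
```

The many local minima of the first function and the flat, curved valley of the second are what make them common stress tests for gradient-based optimizers.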