Over the past decades, diverse classification approaches with different characteristics have been developed to achieve more efficient and accurate results. Although the loss function used in the training procedure is a significant factor in the performance of classification models, it has received comparatively little attention. In previous research, two main categories of loss function, continuous and semi-continuous distance-based, have typically been applied to estimate the unknown parameters of classification models. Among these, continuous distance-based cost functions are the most commonly used and most popular loss functions in diverse statistical and intelligent classifiers. The fundamental principle of this category of loss functions is the continuous reduction of the distance between the fitted and actual values, with the aim of improving model performance. However, since the objective of classification models is inherently discrete, a learning procedure based on a continuous distance-based function is not aligned with the nature of these problems; it is therefore theoretically questionable and, in practice, at least inefficient. Accordingly, to fill this research gap, a discrete direction-based loss function, formulated as a mixed-integer program, is proposed for the training procedure of statistical and shallow/deep intelligent classifiers. In this paper, the impact of the loss function type on the classification rate of classifiers in the energy domain is investigated. For this purpose, logistic regression (LR), the multilayer perceptron (MLP), and the deep multilayer perceptron (DMLP), which are among the most widely used statistical, shallow intelligent, and deep learning classifiers, respectively, are chosen as examples.
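The contrast between a continuous distance-based loss and a discrete (direction-based, 0-1) loss can be sketched as follows. This is a minimal illustration only: the labels, predicted probabilities, squared-error choice, and 0.5 decision threshold are assumptions for exposition, not the paper's exact mixed-integer formulation.

```python
import numpy as np

# Illustrative binary classification example (assumed data, not from the paper).
y_true = np.array([1, 0, 1, 1, 0])              # actual class labels
p_hat = np.array([0.6, 0.4, 0.9, 0.45, 0.1])    # fitted probabilities

# Continuous distance-based loss: penalizes every deviation of the fitted
# value from the label, even when the predicted class is already correct.
mse = np.mean((y_true - p_hat) ** 2)

# Discrete 0-1 loss: counts only misclassifications, matching the discrete
# nature of the classification objective.
y_pred = (p_hat >= 0.5).astype(int)
zero_one = np.mean(y_pred != y_true)

print(f"continuous MSE loss: {mse:.4f}")   # nonzero even for correct classes
print(f"discrete 0-1 loss:   {zero_one:.4f}")
```

Here the continuous loss keeps accumulating penalty on correctly classified samples (e.g., the 0.6 prediction for label 1), whereas the discrete loss changes only when a prediction crosses the decision boundary, which is why minimizing it requires discrete (e.g., mixed-integer) optimization rather than gradient descent.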
Numerical results on 13 benchmark energy datasets show that, in all benchmarks, the performance of the discrete direction learning-based classifiers, i.e., the discrete learning-based logistic regression (DILR), discrete learning-based multilayer perceptron (DIMLP), and discrete learning-based deep multilayer perceptron (DIDMLP), is higher than that of their conventional counterparts. In addition, the proposed DILR, DIMLP, and DIDMLP models yield average classification rates of 89.88%, 94.53%, and 96.02%, representing relative improvements of 6.78%, 5.90%, and 4.69% over the classic versions, which produce only 84.17%, 89.26%, and 91.72%, respectively. Consequently, the discrete direction-based learning methodology can be a more suitable, effective, and valuable alternative for the training process of statistical and shallow/deep intelligent classification models.