Electrical load forecasting is a critical task that supports power utility companies in planning and operation and helps the energy management system (EMS) control and optimize the power grid’s performance. It guarantees system adequacy and reliability by reducing the gap between supply and demand, bolsters cybersecurity by enabling false data detection, and strengthens system security by predicting the system peak load. This article aims to build an accurate long-term load forecasting model based on deep learning (DL) algorithms. First, big-data analytics and feature engineering are applied to the historical power load and weather data. Next, three load forecasting models are developed, each based on a different deep neural network (DNN) model: the simple recurrent neural network (RNN), the long short-term memory (LSTM), and the gated recurrent unit (GRU). To find an accurate DNN model, a two-step hyperparameter optimization combined with cross-validation is performed. In the first step, the random search (RS) algorithm is employed to narrow the hyperparameter search space to a smaller local region. In the second step, the grid search (GS) algorithm is used to find the best hyperparameters within the local region located by RS. However, the computational burden of the DNN training process is high and may lead to out-of-memory (OOM) issues, especially when dealing with large volumes of data. To overcome this problem, we developed our Python program on the TensorFlow framework and deployed it on a high-performance computer cluster (HPCC). Different jobs are then defined to exploit the parallel distributed computing (PDC) capability of the SIU BigDawg HPCC. In this way, all DNN hyperparameters (e.g., number of layers, neurons, dropout, batch size, epochs, and learning rate) can be optimized, resulting in a comprehensive and accurate load forecasting model.
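The two-step search described above can be illustrated with a minimal Python sketch. It is not the program used in this work: the function `validation_loss` is a hypothetical stand-in for the cross-validated loss of a trained DNN, and the two hyperparameters, their ranges, the number of trials, and the grid width are all assumptions chosen only to keep the example small.

```python
import itertools
import random


def validation_loss(params):
    """Hypothetical stand-in for the cross-validated loss of a trained DNN."""
    lr, units = params["learning_rate"], params["units"]
    return (lr - 0.01) ** 2 * 1e4 + ((units - 64) / 64) ** 2


def random_search(space, n_trials, rng):
    """Step 1 (RS): sample the wide space at random and keep the best point."""
    best = None
    for _ in range(n_trials):
        candidate = {
            # Learning rate is sampled on a log scale (an assumption).
            "learning_rate": 10 ** rng.uniform(*space["log10_learning_rate"]),
            "units": rng.randint(*space["units"]),
        }
        loss = validation_loss(candidate)
        if best is None or loss < best[0]:
            best = (loss, candidate)
    return best


def grid_search_around(center, rel_width=0.5, points=5):
    """Step 2 (GS): exhaustively evaluate a small grid around the RS result."""

    def axis(value):
        lo, hi = value * (1 - rel_width), value * (1 + rel_width)
        step = (hi - lo) / (points - 1)
        return [lo + i * step for i in range(points)]

    # Start from the RS point itself so GS can never return a worse result.
    best = (validation_loss(center), dict(center))
    grid = itertools.product(axis(center["learning_rate"]), axis(center["units"]))
    for lr, units in grid:
        candidate = {"learning_rate": lr, "units": max(1, round(units))}
        loss = validation_loss(candidate)
        if loss < best[0]:
            best = (loss, candidate)
    return best


rng = random.Random(0)  # fixed seed so the sketch is reproducible
wide_space = {"log10_learning_rate": (-4, -1), "units": (8, 256)}
rs_loss, rs_params = random_search(wide_space, n_trials=30, rng=rng)
gs_loss, gs_params = grid_search_around(rs_params)
print("random search:", rs_loss, rs_params)
print("grid search:  ", gs_loss, gs_params)
```

In a real setting, each call to `validation_loss` would train and cross-validate a model, so the evaluations within either step are independent of one another and could be submitted as separate cluster jobs.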
Our proposed method, along with the programming framework, is evaluated on a real-world dataset. Among the three models developed and tested, the GRU, with load and weather as features, performs best for long-term forecasting, outperforming the other two models with the best metrics (i.e., MSE, MAPE, and <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$R^2$</tex-math></inline-formula>) and a daily error almost always below 1%. Our best model (the optimized GRU) is also compared with other reported state-of-the-art models such as DBN, SVR, RM-LSTM, DM-GCNN, and TCMS-CNN. The proposed framework is shown to outperform them.
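For reference, the three reported metrics can be computed as in the following Python sketch. It uses the standard textbook definitions of MSE, MAPE, and the coefficient of determination, which are assumed here rather than taken from this work, and the load values are made-up numbers used only for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero in y_true)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up load values for illustration only, not data from the evaluation.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]

print("MSE :", mse(actual, forecast))
print("MAPE:", mape(actual, forecast), "%")
print("R^2 :", r2(actual, forecast))
```

Lower MSE and MAPE and an R² closer to 1 indicate a better fit; MAPE is undefined when an actual value is zero, which is why the sketch assumes strictly nonzero loads.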