Accurate energy consumption prediction is crucial for solving energy scheduling problems. Traditional machine learning models often struggle with small-scale datasets and nonlinear data patterns. To address these challenges, this paper proposes a hybrid grey model based on stacked LSTM layers. The approach uses the neural network structure to enhance feature learning while retaining the strengths of grey models in handling small-scale data. The model is trained with the Adam algorithm, and parameter optimization is performed by grid search. The latest annual data on coal, electricity, and gasoline consumption in Henan Province serve as the application case. The model’s performance is evaluated against nine machine learning models and fifteen grey system models using the metrics RMSE, MAE, MAPE, TIC, U1, and U2. In the testing phase, the proposed model achieves the smallest prediction errors on all of these metrics, indicating higher prediction accuracy and stronger generalization performance. Additionally, the study investigates how the number of LSTM layers affects prediction performance, concluding that adding layers initially improves performance but that too many layers lead to overfitting.
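
For readers unfamiliar with the error metrics named above, the following is a minimal sketch of how four of them can be computed, assuming the commonly used textbook definitions of RMSE, MAE, MAPE (in percent), and Theil's inequality coefficient (TIC). The function names and the sample values are hypothetical and are not taken from the paper. U1 and U2 are omitted because their definitions vary between sources.

```python
import math

# Illustrative definitions only; the paper's exact formulas may differ.

def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

def tic(actual, predicted):
    """Theil's inequality coefficient (assumed form), which lies in [0, 1]."""
    n = len(actual)
    root_mean_sq_actual = math.sqrt(sum(a ** 2 for a in actual) / n)
    root_mean_sq_pred = math.sqrt(sum(p ** 2 for p in predicted) / n)
    return rmse(actual, predicted) / (root_mean_sq_actual + root_mean_sq_pred)

if __name__ == "__main__":
    # Hypothetical values, not data from the study.
    y_true = [100.0, 200.0]
    y_pred = [110.0, 190.0]
    for name, fn in [("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("TIC", tic)]:
        print(name, fn(y_true, y_pred))
```

All four quantities are zero for a perfect forecast, and smaller values indicate a closer fit, which is the sense in which the abstract speaks of the "smallest prediction errors".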