Recent climate conditions in India have led to numerous disasters, including floods, overflowing rivers, broken dams, and reduced vegetation. Machine learning trains models to make predictions from historical data. Although many studies have focused on prediction and model building, this research introduces a novel integrated framework built on a Stacked Heterogeneous Ensemble Model (SHEM). To enhance prediction accuracy, the framework also employs a novel approach for imputing missing values, the Variable Specific Hot Deck (VSHD) imputation method. The imputation outcomes were compared with those of established machine learning techniques: random forest, decision tree, k-nearest neighbor, and support vector machine. After completing the imputation, we implemented the SHEM model. Real-time climate data for Cuddalore, in the state of Tamil Nadu, South India, were collected from the NASA Power Access viewer portal to verify the accuracy of the proposed model. Performance analysis indicates that the proposed imputation outperforms all four alternative models, with a 30% average improvement in accuracy. Moreover, the developed SHEM model achieves a reduced RMSE of 0.321 and an R-squared value of 0.952, a 9% improvement in accuracy over the base model’s performance. Furthermore, these prediction results are compared with those of a Long Short-Term Memory (LSTM) deep learning model in terms of accuracy and loss, showing that the proposed model achieves high accuracy and significant loss during validation. The obtained results can serve as a guideline for atmospheric scientists and developers of weather forecasting applications, helping them choose the most appropriate machine learning method for their prediction task.
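To make the two evaluation quantities and the imputation idea concrete, the following is a minimal sketch and not the authors' VSHD method, whose details are not given in this abstract. It assumes a generic hot deck scheme applied separately to each variable, in which a missing entry is replaced by a randomly drawn observed (donor) value of the same variable, and it computes RMSE and R-squared from their standard definitions. The function names, the seed parameter, and the sample records (`temperature_c`, `humidity_pct`) are hypothetical and chosen only for illustration.

```python
import math
import random


def hot_deck_impute(rows, seed=0):
    """Generic per-variable hot deck (illustrative, NOT the paper's VSHD).

    Each None entry is replaced by a value drawn at random from the
    observed values of the same variable (its donor pool).
    """
    rng = random.Random(seed)
    imputed = [dict(row) for row in rows]  # copy so the input stays untouched
    for col in rows[0].keys():
        donors = [row[col] for row in rows if row[col] is not None]
        if not donors:
            continue  # no observed value to donate; leave the column as is
        for row in imputed:
            if row[col] is None:
                row[col] = rng.choice(donors)
    return imputed


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily records with gaps (made-up values, not NASA data).
records = [
    {"temperature_c": 30.0, "humidity_pct": None},
    {"temperature_c": None, "humidity_pct": 70.0},
    {"temperature_c": 28.0, "humidity_pct": 65.0},
]
filled = hot_deck_impute(records, seed=1)
```

In this sketch each variable keeps its own donor pool, so an imputed temperature is always a temperature that was actually observed. A lower RMSE and an R-squared closer to 1 both indicate predictions that track the observed values more closely.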