This research compares the predictive performance of two widely used machine learning methods, XGBoost and Random Forest, in estimating ocean temperature dynamics in the TS Gulf Stream and Labrador Current regions along the east coast of North America. Annual temperature datasets and relevant oceanographic parameters were processed, cleaned, and split into training and test subsets in RStudio. Model performance was evaluated using standard regression metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared. The results show that XGBoost outperforms Random Forest in prediction accuracy and in minimizing prediction error. XGBoost yields lower MSE and higher R-squared values than Random Forest, indicating a better capacity to explain variation in the data. It also provides consistently more accurate predictions and is more sensitive in identifying the important factors that influence ocean temperature fluctuations. These findings improve the understanding and forecasting of ocean temperature dynamics in the TS Gulf Stream and Labrador Current regions and provide empirical evidence for the efficacy of XGBoost in predicting ocean temperatures in the studied region. Continuous model evaluation and parameter refinement for both methods remain critical to establishing standards for optimal prediction performance. The findings have implications for oceanography and climate science and offer potential pathways for understanding and mitigating the impacts of climate change on marine ecosystems.
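
The four evaluation criteria named in the abstract can be sketched as follows. This is a minimal illustration in Python, not the study's RStudio workflow: the function name `regression_metrics` and the sample temperatures are hypothetical and are not taken from the study's data or results.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R-squared for paired observations and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target variable

    # R-squared: 1 minus residual sum of squares over total sum of squares.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical temperatures in degrees Celsius, for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Under this reading, a model with lower MAE, MSE, and RMSE and a higher R-squared on the held-out test subset would be judged the better predictor, which is the sense in which the abstract ranks XGBoost above Random Forest.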