Significant wave height (SWH) is a key parameter for wave energy extraction, ship navigation, oil and gas extraction, coastal structure construction, and related applications. Direct measurements of SWH using buoys are expensive and accurate only over a limited area, while numerical models are computationally expensive and inaccurate, with limited generalizability. This work therefore focuses on developing generalizable machine learning models for predicting SWH from wind parameters (speed, direction, and gust) and atmospheric parameters (temperature and pressure). Two deep learning models (Artificial Neural Network (ANN) and Self-Normalizing Neural Network (SNN)) and two gradient boosting tree-based models (XGBoost and LightGBM) were used in this study. Three data sets were collected from the National Data Buoy Center: Data Set-1 (DS1): 12 years of data from 47 stations; Data Set-2 (DS2): 14 months of additional data from 6 stations randomly selected from DS1; Data Set-3 (DS3): 13 years of data from 6 completely new stations. The collected data underwent data cleaning, preprocessing, and exploratory data analysis before modeling. DS1 was split into training, validation, and test sets. Training and hyper-parameter tuning were performed on the training and validation sets, while the performance of the models was evaluated on the DS1 test set, DS2, and DS3 using three error metrics: Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared score (R2). The deep learning models demonstrated superior fitting capacity compared with the tree-based models, achieving the lowest MSE (0.047), the lowest MAE (0.153), and the highest R2 score (0.953) on the test data. However, the gradient boosting models demonstrated better generalization than the deep learning models on DS2 and DS3. This study helps further Sustainable Development Goal 7 (SDG 7) by allowing fast and cheap assessment of wave height for ocean energy site development.
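For readers unfamiliar with the three error metrics named above, the following is a minimal sketch of how MSE, MAE, and R2 are conventionally computed from observed and predicted wave heights. It is not the authors' evaluation code: the function name `regression_metrics` and the sample values are hypothetical, chosen only for illustration.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R2 for paired observations and predictions.

    Hypothetical helper for illustration; standard textbook definitions.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean Squared Error: average of squared residuals
    mse = sum(r * r for r in residuals) / n
    # Mean Absolute Error: average of absolute residuals
    mae = sum(abs(r) for r in residuals) / n

    # R2 = 1 - SS_res / SS_tot, where SS_tot is the variation of the
    # observations around their own mean
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observations are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "mae": mae, "r2": r2}


# Made-up wave heights in metres, not data from the study
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(observed, predicted)
```

Lower MSE and MAE indicate a closer fit, while R2 approaches 1 as the predictions explain more of the variance in the observations.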