Abstract

There is growing interest in using sparse in situ salinity data to reconstruct high-resolution, three-dimensional subsurface salinity fields with global coverage. However, in areas without observations, there is nothing to compare the reconstructed fields against, which makes it difficult to assess their quality and improve their accuracy. To address these issues, this study adopted the ‘resampling test’ method, building ‘synthetic data’ to test the performance of different machine learning algorithms. Model data from the high-resolution counterpart of the Centre National de Recherches Meteorologiques Climate Model Version 6 (CNRM-CM6-1-HR) were used. The key advantage of CNRM-CM6-1-HR is that the true salinity values are known across the entire ocean at every point in time, so the reconstruction results can be compared directly against them. The ‘synthetic dataset’ was established by resampling the model data at the locations of in situ observations. It was then used to prepare two datasets: an ‘original synthetic dataset’, with no noise added to the resampled truth values, and a ‘noised synthetic dataset’, with an observation-error perturbation added to the resampled truth values. The resampled model salinity values were taken as the ‘truth values’, and the feed-forward neural network (FFNN) and light gradient boosting machine (LightGBM) approaches were used to design four reconstruction experiments and build multiple sets of reconstruction data. Finally, the advantages and disadvantages of the different reconstruction schemes were compared through a multi-dimensional evaluation of the reconstructed data, and the applicability of the FFNN and LightGBM approaches to reconstructing global salinity from sparse data was discussed. The best-performing scheme had low root-mean-square errors (~0.035 psu) and high correlation coefficients (~0.866).
The dataset reconstructed with this scheme accurately reflected the geographical pattern and vertical structure of the salinity fields, and the scheme also performed well on the noised synthetic dataset. This reconstruction scheme has good generalizability and robustness, which indicates its potential as a practical solution for reconstructing high-resolution subsurface salinity data with global coverage.
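
The resampling test described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the one-dimensional toy "model field", the random observation indices, the noise level `sigma=0.01`, and the helper names are all assumptions made for illustration, and plain linear interpolation stands in for the FFNN and LightGBM reconstructions. The sketch only shows the workflow: resample the model truth at observation locations, optionally perturb it, reconstruct the full field, and score the result against the known truth with the root-mean-square error and the correlation coefficient.

```python
import numpy as np


def resample_at_observations(model_field, obs_index):
    """Pick model 'truth' values at the (hypothetical) observation indices."""
    return model_field[obs_index]


def add_observation_noise(values, sigma, rng):
    """Perturb resampled values with Gaussian noise (assumed error model)."""
    return values + rng.normal(0.0, sigma, size=values.shape)


def rmse(truth, estimate):
    """Root-mean-square error between truth and estimate."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


def correlation(truth, estimate):
    """Pearson correlation coefficient between truth and estimate."""
    return float(np.corrcoef(truth.ravel(), estimate.ravel())[0, 1])


rng = np.random.default_rng(0)

# Toy stand-in for a model salinity field on a flattened grid (values in psu).
grid = np.arange(1000)
model_field = 35.0 + 0.5 * np.sin(np.linspace(0.0, 4.0 * np.pi, grid.size))

# Hypothetical sparse observation locations, sorted for interpolation.
obs_index = np.sort(rng.choice(grid.size, size=100, replace=False))

# 'Original' and 'noised' synthetic datasets.
original_synthetic = resample_at_observations(model_field, obs_index)
noised_synthetic = add_observation_noise(original_synthetic, sigma=0.01, rng=rng)

# Placeholder reconstruction: linear interpolation back onto the full grid.
# A real experiment would train a model (e.g. FFNN or LightGBM) here instead.
recon_original = np.interp(grid, obs_index, original_synthetic)
recon_noised = np.interp(grid, obs_index, noised_synthetic)

# Because the model truth is known everywhere, both reconstructions can be
# scored over the whole grid, including points that were never 'observed'.
for name, recon in [("original", recon_original), ("noised", recon_noised)]:
    print(name, "RMSE:", rmse(model_field, recon),
          "corr:", correlation(model_field, recon))
```

Scoring on the full grid rather than only at the sampled points is the point of the design: it measures how well a scheme fills the unobserved regions, which is not possible with real observations alone.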
