Abstract

EXtreme Gradient Boosting over Decision Trees (XGBoost or XGBDT) is a powerful tool for modeling a wide range of processes. We propose a new approach to building a global total electron content model using machine-learning techniques, in particular gradient boosting. The model is based on the Global Ionospheric Maps computed by Universitat Politècnica de Catalunya with a combined tomographic-kriging technique (UQRG). To reduce the complexity of the problem, we used empirical orthogonal functions (EOFs); the model involves the first 16 spatial EOFs. We used the 1998–2016 data for training and validation and the 2017 data as a test data set. To drive the model, we used the following features: (1) geomagnetic activity indices (Kp, Ap, AE, AU, AL) and solar activity indices (R, F10.7); (2) values derived from these indices, such as the mean and standard deviation over the last 12 h, last 11 days, and last 40 days; (3) day of the year (DOY); (4) averaged EOFs for a given Kp and UT, and those for a given DOY and UT. The validation data set yielded the following hyperparameters for XGBoost learning: 100 trees, a tree depth of 6, and a learning rate of 0.1. Comparisons with the NeQuick2, Klobuchar, and GEMTEC models show that the machine-learning model achieves higher accuracy on the 2017 test data set. The globally averaged root-mean-square errors and mean absolute percentage errors were about 2.5 TECU and 19% for the nonlinear GIMLi-XGBDT model, about 4 TECU and 30–40% for NeQuick2, GEMTEC, and the linear model GIMLi-LM, and about 5.2 TECU and 73% for the Klobuchar model. An artificial neural network with four fully connected layers gave a higher error (3.28 TECU and 27.7%) than GIMLi-XGBDT. For all of these models, the error peaked in the equatorial anomaly region. An increase in solar activity does not affect the error of the nonlinear GIMLi-XGBDT model, whereas an increase in geomagnetic activity strongly affects it.
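
The EOF-based dimensionality reduction and the two error metrics quoted above can be illustrated with a minimal sketch. It rests on several assumptions: the maps are synthetic random data rather than UQRG maps, the EOFs are obtained from a singular value decomposition (one common way to compute them, not necessarily the procedure used for the model), and all function and variable names are hypothetical. The regression step that predicts the EOF coefficients from the driving features is omitted here.

```python
import numpy as np

def eof_decompose(maps, n_modes=16):
    """Split maps (n_times, n_cells) into a mean map, spatial EOFs, and coefficients."""
    mean_map = maps.mean(axis=0)
    anomalies = maps - mean_map
    # Rows of vt are orthonormal spatial patterns, ordered by explained variance.
    _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]              # (n_modes, n_cells)
    coeffs = anomalies @ eofs.T      # (n_times, n_modes), one time series per EOF
    return mean_map, eofs, coeffs

def reconstruct(mean_map, eofs, coeffs):
    """Rebuild maps from the mean map and (possibly predicted) EOF coefficients."""
    return mean_map + coeffs @ eofs

def rmse(pred, true):
    """Root-mean-square error, in the units of the maps (TECU for real data)."""
    return float(np.sqrt(np.mean((pred - true) ** 2)))

def mape(pred, true):
    """Mean absolute percentage error; assumes `true` has no zeros."""
    return float(100.0 * np.mean(np.abs(pred - true) / np.abs(true)))

# Synthetic stand-in for a time series of global maps (assumption: not real data).
rng = np.random.default_rng(0)
n_times, n_cells, n_true_modes = 200, 500, 16
patterns = rng.normal(size=(n_true_modes, n_cells))
amplitudes = rng.normal(size=(n_times, n_true_modes))
noise = 0.1 * rng.normal(size=(n_times, n_cells))
tec_maps = 20.0 + 0.5 * amplitudes @ patterns + noise

mean_map, eofs, coeffs = eof_decompose(tec_maps, n_modes=16)
approx_16 = reconstruct(mean_map, eofs, coeffs)

# Truncating to fewer EOFs discards more variance and raises the error.
approx_4 = reconstruct(mean_map, eofs[:4], coeffs[:, :4])

print("RMSE with 16 EOFs:", rmse(approx_16, tec_maps))
print("RMSE with  4 EOFs:", rmse(approx_4, tec_maps))
print("MAPE with 16 EOFs (%):", mape(approx_16, tec_maps))
```

In a setup like the one described, each of the 16 coefficient time series would be the regression target, so the learner predicts 16 numbers per epoch instead of a full global map, and `reconstruct` turns the predictions back into a map.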
