Abstract

Coronal mass ejections (CMEs), a violent form of solar eruptive activity, can significantly affect space weather. On arriving at the Earth, they interact with the geomagnetic field, increasing its energy supply and possibly triggering geomagnetic storms, with potentially catastrophic effects on human activities. Accurate forecasting of the Sun-to-Earth transit time of CMEs is therefore vital for mitigating the associated losses. In this work, XGBoost, an ensemble model that has performed well in other fields, is applied to space weather forecasting for the first time. Across multiple tests with random data splits, the best mean absolute error (MAE) was ∼5.72 hr; in that test, 62% of the test CMEs had an absolute arrival time error of less than 5.72 hr. The average MAE over all random tests was ∼10 hr. These results indicate that the method has promising predictive potential and can serve as a baseline. Moreover, we introduce two effective methods for ranking feature importance. One is the information gain method, which is built into ensemble models; the other is the permutation method. The former ranks the CME features using the model's learning process, and the latter using the model's performance. Compared with direct correlation analysis on the sample data set, they help select the important features that closely match the model. Both methods can help researchers process large sample data sets, which often require feature selection in advance.
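
As an illustration of the permutation method mentioned above, the following is a minimal sketch in plain Python, not the authors' implementation. It shuffles one feature column at a time and records how much the MAE degrades; a larger increase suggests a more important feature. The data are synthetic, and `toy_predict`, the two-column feature layout, and the constant-speed travel-time formula are assumptions made purely for this example; in practice `predict` would be the prediction function of a fitted model such as XGBoost.

```python
import random


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Return, per feature, the mean increase in MAE after shuffling that column."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other column untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mean_absolute_error(y, predict(X_perm)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic stand-in data (NOT real CME measurements): column 0 plays the role
# of an initial speed in km/s, column 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(400.0, 2000.0), data_rng.uniform(0.0, 1.0)] for _ in range(200)]
# Assumed toy target: constant-speed travel time over roughly 1 au (1.5e8 km), in hours.
y = [1.5e8 / row[0] / 3600.0 for row in X]


def toy_predict(rows):
    """Hypothetical stand-in for a fitted model; it uses only column 0."""
    return [1.5e8 / row[0] / 3600.0 for row in rows]


importances = permutation_importance(toy_predict, X, y)
# Feature indices ordered from most to least important.
ranking = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
```

Because the toy predictor ignores the noise column, shuffling that column leaves the MAE unchanged, while shuffling the speed column degrades it. Unlike a correlation analysis on the raw data, this ranking reflects what the model actually relies on.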
