Abstract

The accurate prediction of solar flares is crucial because of the risks they pose to astronauts, space equipment, and satellite communication systems. Our research enhances solar flare prediction by applying data preprocessing and sampling techniques to the Space Weather Analytics for Solar Flares (SWAN-SF) data set, a rich source of multivariate time series data of solar active regions. Our study combines four key methodologies. First, we address over 10 million missing values in the SWAN-SF data set through our imputation technique, fast Pearson correlation-based k-nearest neighbors imputation. Second, we propose a normalization technique tailored for time series data, called LSBZM normalization, which merges several strategies (log, square root, Box–Cox, Z-score, and min–max) to uniformly scale the data set's 24 attributes (photospheric magnetic field parameters) and to address issues such as skewness. Third, we explore the “near decision boundary sample removal” technique, which improves classification performance on the data set by resolving the challenge of class overlap. Finally, a pivotal aspect of our research is a thorough evaluation of diverse oversampling and undersampling methods, including SMOTE, ADASYN, Gaussian noise injection, TimeGAN, Tomek links, and random undersampling, to counter the severe imbalance in the SWAN-SF data set, notably a roughly 60:1 ratio of minor (C, B, and FQ) to major (X and M) flaring events in binary classification. To demonstrate the effectiveness of our methods, we use eight classification algorithms, including advanced deep-learning-based architectures. Our analysis shows significant true skill statistic scores, underscoring the importance of data preprocessing and sampling in time-series-based solar flare prediction.
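As a minimal illustration of why the true skill statistic (TSS) suits such an imbalanced problem, the sketch below computes it from confusion-matrix counts using the standard definition, recall minus false alarm rate. The function name and all counts are hypothetical, chosen only to mimic roughly 60 negative samples per positive sample; they are not results from this study.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = TP / (TP + FN) - FP / (FP + TN)."""
    recall = tp / (tp + fn)            # fraction of major flares detected
    false_alarm_rate = fp / (fp + tn)  # fraction of non-major events flagged
    return recall - false_alarm_rate


# Hypothetical counts: 100 positives (major flares), 6000 negatives.
tss_useful = true_skill_statistic(tp=80, fn=20, fp=600, tn=5400)

# A classifier that always predicts "no major flare" on the same data.
tss_trivial = true_skill_statistic(tp=0, fn=100, fp=0, tn=6000)
accuracy_trivial = (0 + 6000) / (100 + 6000)

print(f"TSS (useful classifier):   {tss_useful:.3f}")
print(f"TSS (always negative):     {tss_trivial:.3f}")
print(f"Accuracy (always negative): {accuracy_trivial:.3f}")
```

In this toy setting the always-negative classifier reaches an accuracy above 0.98 yet has a TSS of zero, which is why a skill score that weighs both classes equally is preferred over accuracy here.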
