Abstract

The massive repository of images of the Sun captured by the Solar Dynamics Observatory (SDO) mission has ushered in the era of Big Data for Solar Physics. In this work, we investigate the entire public collection of events reported to the Heliophysics Event Knowledgebase (HEK) from automated solar feature recognition modules operated by the SDO Feature Finding Team (FFT). With the SDO mission recently surpassing five years of operations, and with over 280,000 event reports for seven types of solar phenomena, we present the most comprehensive large-scale dataset of the SDO FFT modules to date. We also present numerous statistics on these modules, providing valuable contextual information for understanding and validating both the individual event reports and the dataset as a whole. After extensive data cleaning through exploratory data analysis, we highlight several opportunities for knowledge discovery from data (KDD). With the prerequisite analyses presented here, the results of KDD from Solar Big Data will be more reliable and better understood. As the SDO mission remains operational over the coming years, these datasets will continue to grow in size and value. Future versions of this dataset will be analyzed in the general framework established in this work and maintained publicly online for easy access by the community.

Highlights

  • The era of Big Data is here for Solar Physics

Summary

Introduction

With the Solar Dynamics Observatory (SDO) mission capturing over 150,000 high-resolution full-disk images of the Sun per day (Pesnell et al., 2012), never before has such a massive volume of solar images been available. Given this deluge of data, which will likely only increase with future missions, it is infeasible to continue traditional brute-force human analysis and labeling of solar phenomena in every image. With an abundance of generated event reports, we are able to analyze large-scale statistics and facilitate knowledge discovery from data (KDD). This better picture of solar phenomena (events) through direct large-scale observations has unprecedented potential to advance scientific understanding, and possibly prediction, of such events and related space weather processes.

Background
The data
Collection
Cleaning checks
Reporting statistics
Spatial and temporal attributes
Event-specific attributes
Null values
Real values
Event sub-types
Dataset dissemination
Data-driven analysis
Findings
Conclusions