Abstract
A new data mining technique called MineTool‐TS is introduced that captures the time‐lapse information in multivariate time series data by extracting global features and metafeatures. The technique is implemented as Java‐based data mining software that automates every step of model building, making it more accessible to nonexperts. As its first application in space sciences, MineTool‐TS is used to develop a model for automated detection of flux transfer events (FTEs) at Earth's magnetopause in Cluster spacecraft time series data. The model classifies a given time series into one of three categories: non‐FTE, magnetosheath FTE, or magnetospheric FTE. One important feature of MineTool‐TS is the ability to explore the importance of each variable, or combination of variables, as an indicator of FTEs. FTEs have traditionally been identified on the basis of their magnetic field signatures, but here we find that some plasma variables can also be effective indicators. For example, a model based solely on the perpendicular ion temperature yields an accuracy of ∼93%, while a model based solely on the normal magnetic field BN yields an accuracy of ∼95%. This opens up the possibility of searching for more unusual FTEs that may, for example, have no clear BN signature, and of creating a more comprehensive and less biased list of FTEs for statistical studies. We also find that models using GSM coordinates yield accuracy comparable to those using boundary normal coordinates. This is useful because there are regions where magnetopause models are not accurate. Another surprising result is that the algorithm can largely detect FTEs, and even distinguish between magnetosheath and magnetospheric FTEs, using models built from single parameters alone, something that experts may not be able to do as straightforwardly on the basis of short time series intervals.
The most accurate models use a combination of plasma and magnetic field variables and achieve a prediction accuracy of ∼99%. We explain the high detection accuracies both in terms of the existence of clear physical signatures of FTEs (for the majority of cases) and in terms of the capability of the data mining technique to explore the data set far more thoroughly than expert human eyes can. A list of 1222 FTEs from Cluster data for the years 2001–2003 is provided as auxiliary material.