Abstract

The ability to characterize and predict extreme events is a vital topic in fields ranging from finance to ocean engineering. Typically, the most extreme events are also the rarest, and it is this property that makes data collection and direct simulation challenging. We consider the problem of deriving optimal predictors of extremes directly from data characterizing a complex system, by formulating the problem in the context of binary classification. Specifically, we assume that a training dataset consists of: (i) indicator time series specifying whether or not an extreme event occurs; and (ii) observable time series, which are employed to formulate efficient predictors. We employ and assess standard binary classification criteria for the selection of optimal predictors, such as total and balanced error and area under the curve, in the context of extreme event prediction. For physical systems in which there is sufficient separation between the extreme and regular events, i.e., extremes are distinguishably larger than regular events, we prove the existence of optimal extreme event thresholds that lead to efficient predictors. Moreover, motivated by the special character of extreme events, i.e., the very low rate of occurrence, we formulate a new objective function for the selection of predictors. This objective is constructed from the same principles as receiver operating characteristic curves, and exhibits a geometric connection to the regime separation property. We demonstrate the application of the new selection criterion to the advance prediction of intermittent extreme events in two challenging complex systems: the Majda–McLaughlin–Tabak model, a 1D nonlinear, dispersive wave model, and the 2D Kolmogorov flow model, which exhibits extreme dissipation events.
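The binary classification setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the thresholds `a_hat` (on the indicator) and `b_hat` (on the predictor), and the toy data are all assumptions made for illustration. An event is labelled extreme when the indicator exceeds `a_hat` and is predicted when the predictor exceeds `b_hat`; the total and balanced error rates then follow from the four confusion counts.

```python
import numpy as np

def confusion_counts(indicator, predictor, a_hat, b_hat):
    """Count outcomes when extremes are labelled by indicator > a_hat
    and predicted by predictor > b_hat (both thresholds are hypothetical)."""
    actual = np.asarray(indicator) > a_hat
    predicted = np.asarray(predictor) > b_hat
    tp = int(np.sum(actual & predicted))    # extreme, predicted
    fp = int(np.sum(~actual & predicted))   # quiescent, predicted
    fn = int(np.sum(actual & ~predicted))   # extreme, missed
    tn = int(np.sum(~actual & ~predicted))  # quiescent, not predicted
    return tp, fp, fn, tn

def total_error(tp, fp, fn, tn):
    """Fraction of all samples that are misclassified."""
    return (fp + fn) / (tp + fp + fn + tn)

def balanced_error(tp, fp, fn, tn):
    """Mean of the false negative rate and the false positive rate."""
    fnr = fn / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return 0.5 * (fnr + fpr)

# Toy data (made up): two extremes among ten samples.
indicator = [0.1, 0.2, 3.0, 0.3, 0.1, 2.5, 0.2, 0.1, 0.2, 0.3]
predictor = [0.0, 0.1, 0.9, 0.2, 0.1, 0.2, 0.8, 0.1, 0.0, 0.1]

counts = confusion_counts(indicator, predictor, a_hat=1.0, b_hat=0.5)
print(counts, total_error(*counts), balanced_error(*counts))
```

In this toy example one of the two extremes is missed, yet the total error is only 0.2 because quiescent samples dominate the count; the balanced error (0.3125) weights the missed extreme more heavily. This is the kind of sensitivity to event rarity that motivates looking beyond the total error rate.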

Highlights

  • Many phenomena in a wide range of physical domains and engineering applications have observable properties that are normally distributed, that is, they obey Gaussian statistics. Gaussian-distributed random variables and processes are easy to manipulate algebraically, and there is a rich literature using their properties in widely varying areas of probability and statistics, from Bayesian regression [1] to stochastic differential equations [2].

  • Before we proceed to the definition of a measure that explicitly takes into account the rare event character of extreme events, it is important to study some properties of the precision–recall–rate surface

  • We have formulated a method for optimizing extreme event prediction in an equation-free manner, i.e., using only data


Summary

Introduction

Many phenomena in a wide range of physical domains and engineering applications have observable properties that are normally distributed, that is, they obey Gaussian statistics. We first discuss limitations of standard methods from binary classification in the context of extreme event precursors. Motivated by these limitations, we design a machine learning approach to compute optimal precursors to extreme events (“predictors”) directly from data, taking into account the most important aspect of extreme events: their rare character. This approach naturally suggests a frontier for trade-offs between false positive and false negative error rates, and in many cases geometrically identifies an optimal threshold to separate extreme and quiescent events. The machine-learned predictors support previous analysis on the mechanisms of intermittency in the systems studied.
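The trade-off between missed events and false alarms can be traced by sweeping the predictor threshold and recording precision and recall at each value. The sketch below is an illustration under stated assumptions, not the selection criterion proposed in the paper: the function names, the extreme event threshold `a_hat`, the convention of assigning precision 1 when nothing is predicted, and the toy data are all hypothetical.

```python
import numpy as np

def precision_recall_curve(indicator, predictor, a_hat, thresholds):
    """Sweep predictor thresholds (from high to low) and return arrays of
    recall and precision. Extremes are samples with indicator > a_hat."""
    actual = np.asarray(indicator) > a_hat
    predictor = np.asarray(predictor)
    recalls, precisions = [], []
    # Descending thresholds make recall non-decreasing along the curve.
    for b_hat in np.sort(np.asarray(thresholds, dtype=float))[::-1]:
        predicted = predictor > b_hat
        tp = np.sum(actual & predicted)
        fp = np.sum(~actual & predicted)
        fn = np.sum(actual & ~predicted)
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
        # Assumed convention: precision is 1 when no event is predicted.
        precisions.append(tp / (tp + fp) if (tp + fp) else 1.0)
    return np.array(recalls), np.array(precisions)

def area_under_curve(recalls, precisions):
    """Trapezoidal area under precision as a function of recall."""
    widths = np.diff(recalls)
    heights = 0.5 * (precisions[1:] + precisions[:-1])
    return float(np.sum(widths * heights))

# Toy data (made up) in which the predictor separates the two extremes perfectly.
indicator = [0.0, 0.0, 1.0, 0.0, 1.0]
predictor = [0.1, 0.4, 0.8, 0.2, 0.9]
thresholds = [0.0, 0.3, 0.5, 0.85, 1.0]

recalls, precisions = precision_recall_curve(indicator, predictor, 0.5, thresholds)
print(recalls, precisions, area_under_curve(recalls, precisions))
```

Because the toy predictor separates extreme from quiescent samples perfectly, the area equals 1. Repeating the sweep for several values of `a_hat` changes the rate of events labelled extreme, which is one way to explore how such a curve depends on event rarity.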

A Critical Overview of Binary Classification Methods
Total and Balanced Error Rate
Area under the Precision–Recall Curve
Volume under the Precision–Recall–Rate Surface
Separation of Extreme and Quiescent Events
A Predictor Selection Criterion Adjusted for Extreme Events
Coinflip Indicator—Predictor
Bimodal Indicator—Predictor
Gaussian Indicator—Predictor
Applications
Numerical Results
The Kolmogorov Flow Model
Conclusions