Abstract
Forecasting extremely rare events is a pressing problem, but efforts to model such outcomes are often limited by the presence of multiple causes within classes of events, too few observations of the outcome to assess fit, and estimates that are biased by that same scarcity. We introduce a novel approach for analyzing rare event data that addresses these challenges by turning attention to the conditions under which rare outcomes do not occur. We detail how configurational methods can be used to identify conditions or sets of conditions that would preclude the occurrence of a rare outcome. Results from Monte Carlo experiments show that our approach can be used to systematically preclude up to 78.6% of observations, and application to ground-truth data coupled with a bootstrap inferential test illustrates how our approach can also yield novel substantive insights that are obscured by standard statistical analyses.
Highlights
Revolutions, economic crashes, and nuclear disasters are examples of extremely rare events [1,2,3]
We introduce a novel approach to analyzing rare event data that turns attention to the conditions under which rare outcomes do not occur
We treat rare events as causally asymmetric, such that the conditions leading to the occurrence of the rare event are distinct from the conditions associated with non-occurrence [8]. Leveraging this causal asymmetry, we identify conditions that systematically preclude the occurrence of a rare event
Summary
Revolutions, economic crashes, and nuclear disasters are examples of extremely rare events [1,2,3]. The specific causes identified for one instance of a rare outcome may not (and often do not) generalize to other instances [4]. Many such events are rare enough that there is insufficient data to appropriately assess the extent to which the observations fit a statistical distribution, creating (often unseen) problems of inference [5]. When there are numerically few observations where Y = 1, model estimates will be biased toward the zeroes and underestimate the probability of a 1 in finite samples [6,7].
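The idea of focusing on conditions under which the rare outcome does not occur can be illustrated with a minimal sketch. This is not the authors' configurational procedure. It is a simplified screen over single binary conditions, and the function names, the condition names A and B, and the toy data are all hypothetical. It looks for condition values under which Y = 1 is never observed, then reports the share of observations those values would set aside.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def precluding_conditions(rows: List[Row], outcome: str = "Y") -> List[Tuple[str, int]]:
    """Return (condition, value) pairs under which the outcome is never observed.

    Hypothetical helper: a single-condition screen, not a full configurational analysis.
    """
    conditions = [name for name in rows[0] if name != outcome]
    found = []
    for cond in conditions:
        for value in (0, 1):
            matching = [r for r in rows if r[cond] == value]
            # Require at least one matching row so an empty subset does not count.
            if matching and all(r[outcome] == 0 for r in matching):
                found.append((cond, value))
    return found


def share_precluded(rows: List[Row], pairs: List[Tuple[str, int]]) -> float:
    """Fraction of rows that match at least one precluding (condition, value) pair."""
    hit = [r for r in rows if any(r[c] == v for c, v in pairs)]
    return len(hit) / len(rows)


# Hypothetical toy data: Y = 1 occurs once, and only when both A and B are present.
rows = [
    {"A": 1, "B": 1, "Y": 1},
    {"A": 1, "B": 0, "Y": 0},
    {"A": 0, "B": 1, "Y": 0},
    {"A": 0, "B": 0, "Y": 0},
    {"A": 1, "B": 1, "Y": 0},
    {"A": 0, "B": 1, "Y": 0},
]

pairs = precluding_conditions(rows)
share = share_precluded(rows, pairs)
print(pairs)
print(round(share, 3))
```

In this toy data the absence of A and the absence of B each rule out the outcome, so four of the six observations can be set aside before any modelling of the remaining cases. With real data, such a screen would need an inferential check, since a condition can appear to preclude the outcome purely by chance when Y = 1 is rare.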