Abstract

Quantitative evaluation of the flatness of the verification rank histogram can be approached through formal hypothesis testing. Traditionally, the familiar χ² test has been used for this purpose. Recently, two alternatives, the reliability index (RI) and an entropy statistic (Ω), have been suggested in the literature. This paper presents approximations to the sampling distributions of these latter two rank histogram flatness metrics and compares the statistical power of tests based on the three statistics in a controlled setting. The χ² test is generally most powerful (i.e., most sensitive to violations of the null hypothesis of rank uniformity), although for overdispersed ensembles and small sample sizes the test based on the entropy statistic Ω is more powerful. The RI-based test is preferred only for unbiased forecasts with small ensembles and very small sample sizes.
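
To make the three flatness statistics concrete, the following is a minimal sketch of how they might be computed from rank histogram bin counts. The abstract does not state the formulas, so the definitions used here are assumptions: the usual Pearson χ² against a uniform expectation, RI taken as the sum of absolute deviations of the bin relative frequencies from 1/K, and Ω taken as the Shannon entropy of the relative frequencies normalized by log K, where K is the number of bins. The function name `flatness_stats` is hypothetical.

```python
import math


def flatness_stats(counts):
    """Return (chi2, ri, omega) for a list of rank-histogram bin counts.

    Assumed definitions (not given in the abstract):
      chi2  = sum_k (n_k - n/K)^2 / (n/K)
      ri    = sum_k |n_k/n - 1/K|
      omega = -(1/log K) * sum_k (n_k/n) * log(n_k/n)
    where K is the number of bins and n the total number of forecasts.
    """
    n = sum(counts)
    k = len(counts)
    expected = n / k  # expected count per bin under rank uniformity

    chi2 = sum((c - expected) ** 2 / expected for c in counts)

    freqs = [c / n for c in counts]
    ri = sum(abs(f - 1.0 / k) for f in freqs)

    # Empty bins contribute zero to the entropy (limit of f*log f as f -> 0).
    omega = -sum(f * math.log(f) for f in freqs if f > 0) / math.log(k)

    return chi2, ri, omega


# A perfectly flat histogram versus one with all ranks in a single bin.
flat = flatness_stats([10, 10, 10, 10])
peaked = flatness_stats([40, 0, 0, 0])
```

Under these assumed definitions, a flat histogram gives χ² = 0, RI = 0, and Ω = 1, and departures from flatness increase χ² and RI while decreasing Ω. Turning any of the three into a hypothesis test additionally requires its sampling distribution under the null hypothesis, which this sketch does not attempt to reproduce.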
