Graphical test for discrete uniformity and its applications in goodness-of-fit evaluation and multiple sample comparison

Teemu Säilynoja, Aki Vehtari, Paul-Christian Bürkner

doi:10.1007/s11222-022-10090-6

Open Access

Journal: Statistics and Computing
Publication Date: Mar 24, 2022
Citations: 13
License type: open-access

Affiliation: Aalto University, University of Stuttgart

Abstract

Assessing goodness of fit to a given distribution plays an important role in computational statistics. The probability integral transformation (PIT) can be used to convert the question of whether a given sample originates from a reference distribution into a problem of testing for uniformity. We present new simulation- and optimization-based methods to obtain simultaneous confidence bands for the whole empirical cumulative distribution function (ECDF) of the PIT values under the assumption of uniformity. Simultaneous confidence bands consist of pointwise confidence intervals that jointly attain a desired coverage level. These methods can also be applied in cases where the reference distribution is represented only by a finite sample, which is useful, for example, for simulation-based calibration. The confidence bands provide an intuitive ECDF-based graphical test for uniformity, which also provides useful information on the nature of the discrepancy. We further extend the simulation and optimization methods to determine simultaneous confidence bands for testing whether multiple samples come from the same underlying distribution. This multiple sample comparison test is useful, for example, as a complementary diagnostic in multi-chain Markov chain Monte Carlo (MCMC) convergence diagnostics, where most currently used convergence diagnostics provide a single diagnostic value but do not usually offer insight into the nature of the deviation. We provide numerical experiments to assess the properties of the tests using both simulated and real-world data and give recommendations on their practical application in computational statistics workflows.
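
To make the idea of a PIT-based ECDF check concrete, the following is a minimal Python sketch. It is not the authors' algorithm: it calibrates the band by plain Monte Carlo, shrinking a pointwise level `gamma` by bisection until the simulated uniform ECDFs stay inside the band jointly with frequency at least 1 - alpha. The function names (`pit_values`, `ecdf_at`, `simultaneous_band`), the evaluation grid, the exponential reference distribution, and the number of simulations are all assumptions chosen for illustration.

```python
import numpy as np

def pit_values(sample, reference_cdf):
    """Probability integral transform: push the sample through the reference CDF."""
    return reference_cdf(np.asarray(sample, dtype=float))

def ecdf_at(values, points):
    """Fraction of values less than or equal to each evaluation point."""
    return np.searchsorted(np.sort(values), points, side="right") / len(values)

def simultaneous_band(n, points, alpha=0.05, n_sims=2000, seed=0):
    """Monte Carlo band for the ECDF of n uniform draws (illustrative sketch).

    Returns (lower, upper, gamma), where gamma is the pointwise level whose
    intervals jointly cover at least 1 - alpha of the simulated ECDFs.
    """
    rng = np.random.default_rng(seed)
    sims = rng.uniform(size=(n_sims, n))
    ecdfs = np.stack([ecdf_at(row, points) for row in sims])

    def band(gamma):
        lower = np.quantile(ecdfs, gamma / 2, axis=0)
        upper = np.quantile(ecdfs, 1 - gamma / 2, axis=0)
        return lower, upper

    def joint_coverage(gamma):
        lower, upper = band(gamma)
        inside = np.all((ecdfs >= lower) & (ecdfs <= upper), axis=1)
        return inside.mean()

    # Bisection on gamma: gamma = 0 gives the min/max envelope (coverage 1),
    # gamma = alpha gives pointwise intervals that are too narrow jointly.
    ok, too_large = 0.0, alpha
    for _ in range(30):
        mid = (ok + too_large) / 2
        if joint_coverage(mid) >= 1 - alpha:
            ok = mid
        else:
            too_large = mid
    lower, upper = band(ok)
    return lower, upper, ok

# Usage: test whether a sample is consistent with an Exponential(1) reference.
rng = np.random.default_rng(1)
sample = rng.exponential(size=100)
pit = pit_values(sample, lambda x: 1.0 - np.exp(-x))
points = np.linspace(0.0, 1.0, 21)[1:-1]  # interior evaluation points
lower, upper, gamma = simultaneous_band(len(pit), points)
observed = ecdf_at(pit, points)
inside = bool(np.all((observed >= lower) & (observed <= upper)))
print(f"pointwise level gamma = {gamma:.4f}, ECDF inside band: {inside}")
```

Because the band is calibrated on the same simulated draws used to build it, its coverage is only approximate. The point of the sketch is that an ECDF leaving the band at some evaluation point signals a departure from uniformity, and where it leaves the band hints at the kind of discrepancy.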

Full Text