Comparison of validation variants by sum of ranking differences and ANOVA

Károly Héberger, Klára Kollár‐Hunek

doi:10.1002/cem.3104

Abstract

An old debate is revived: scientific disciplines differ markedly in how they recommend estimating the prediction performance of models and in which validation variants they favor. However, the best or recommended practice for a given data set cannot depend on the field of use. Fortunately, a method comparison algorithm exists that can rank and group the validation variants; combining it with analysis of variance (ANOVA) reveals whether the differences are significant or merely the play of random errors. Three case studies were therefore selected carefully to reveal similarities and differences among validation variants. The case studies illustrate well that the significance of these variants differs from case to case. In special circumstances, any of the factors that define a validation variant can exert a significant influence on the evaluation by sums of (absolute) ranking differences (SRDs): the way of resampling (stratified, i.e., contiguous block, or repeated Monte Carlo) and the number of parts into which the data set is split (5, 7, or 10). The optimal validation variant should therefore be determined individually for each case. Random resampling with sevenfold cross‐validation seems to be a good compromise that diminishes bias and variance alike. If the data structure is unknown, randomizing the rows before SRD analysis is suggested. On the other hand, the differences among classifiers, validation schemes, and models always proved to be significant, and even subtle differences can be detected reliably using SRD and ANOVA.
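To make the phrase "sums of (absolute) ranking differences" concrete, the following is a minimal Python sketch of how an SRD value could be computed for each column of a small table. It is an illustration under stated assumptions, not the authors' implementation: the function names `rank_average` and `srd`, the use of the row-wise mean as the default reference, and the toy data are all hypothetical, and the sketch omits the normalization, validation, and ANOVA steps of a full analysis.

```python
from statistics import mean


def rank_average(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def srd(columns, reference=None):
    """Sum of absolute ranking differences for each column.

    columns:   dict mapping a column name (e.g. a validation variant)
               to a list of values, one per row.
    reference: list of reference values, one per row. ASSUMPTION: if it is
               omitted, the row-wise mean of all columns is used.
    """
    n_rows = len(next(iter(columns.values())))
    if reference is None:
        reference = [mean(col[i] for col in columns.values()) for i in range(n_rows)]
    ref_ranks = rank_average(reference)
    return {
        name: sum(abs(r - q) for r, q in zip(rank_average(col), ref_ranks))
        for name, col in columns.items()
    }


# Hypothetical toy data: three columns scored on four rows.
toy = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1], "C": [1, 3, 2, 4]}
scores = srd(toy, reference=[1, 2, 3, 4])
```

With the explicit reference `[1, 2, 3, 4]`, column A reproduces the reference ranking exactly and receives an SRD of 0, while the fully reversed column B receives the largest value. A smaller SRD therefore means a ranking closer to the reference; whether a difference between two SRD values is meaningful is the question the abstract assigns to ANOVA.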
