Abstract
The ROC curve is the gold standard for measuring the performance of a test/scoring statistic regarding its capacity to discriminate between two statistical populations in a wide variety of applications, ranging from anomaly detection in signal processing to information retrieval, through medical diagnosis. Most practical performance measures used in scoring/ranking applications, such as the AUC, the local AUC, the p-norm push, the DCG and others, can be viewed as summaries of the ROC curve. In this paper, we highlight the fact that most of these empirical criteria can be expressed as two-sample linear rank statistics, and we prove concentration inequalities for collections of such random variables, referred to here as two-sample rank processes, when they are indexed by VC classes of scoring functions. Based on these nonasymptotic bounds, the generalization capacity of empirical maximizers of a wide class of ranking performance criteria is then investigated from a theoretical perspective. The theoretical analysis is also supported by empirical evidence from numerical experiments.
Highlights
We analyze the experimental results by commenting on the test ROC curves obtained after learning the scoring functions that maximize the chosen W_φ-performance measure (MWW, Pol and RTB), using the early-stopped version of Algorithm 1
This article argues that two-sample linear rank statistics provide a very flexible and natural class of empirical performance measures for bipartite ranking
We have shown that it encompasses, in particular, well-known criteria used in medical diagnosis and information retrieval, and proved that, in expectation, these criteria are maximized by optimal scoring functions. Emphasis is also put on the gradient ascent (GA) algorithm used to find the optimal parameter for the class of scoring functions
Summary
We start by recalling key notions pertaining to ROC analysis and bipartite ranking, which essentially motivate the theoretical analysis carried out in the subsequent section. We recall at length the definition of two-sample linear rank statistics, which have been intensively used to design statistical (homogeneity) testing procedures in the univariate setup, and highlight that many scalar summaries of empirical ROC curves, commonly used as ranking performance criteria, are precisely of this form. The indicator function of any event E is denoted by I{E}, the Dirac mass at any point x by δ_x, and the generalized inverse of any cumulative distribution function W(t) on ℝ ∪ {+∞} by W⁻¹(u) = inf{t ∈ ]−∞, +∞] : W(t) ≥ u}, u ∈ [0, 1]. We denote the floor and ceiling functions by u ∈ ℝ ↦ ⌊u⌋ and by u ∈ ℝ ↦ ⌈u⌉ respectively
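To make the link between two-sample linear rank statistics and ROC summaries concrete, here is a minimal Python sketch. It rests on assumptions that are not stated in the text above: the statistic is taken to be the sum, over positive instances, of φ applied to the pooled rank of each positive score divided by N + 1 (N being the pooled sample size), the scores are assumed to have no ties, and the function names `two_sample_rank_statistic` and `empirical_auc` are hypothetical. With the score-generating function φ(u) = u, the statistic reduces to a rescaled rank sum of Mann-Whitney-Wilcoxon type, which is an affine function of the empirical AUC.

```python
import numpy as np


def two_sample_rank_statistic(pos_scores, neg_scores, phi):
    """Sum of phi(Rank(s(X_i^+)) / (N + 1)) over the positive sample.

    Assumption: Rank(t) counts the pooled scores that are <= t, and the
    normalization by N + 1 is a convention chosen for this sketch.
    """
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    pooled = np.sort(np.concatenate([pos, neg]))
    n_total = pooled.size
    # side="right" returns the number of pooled scores <= each positive score.
    ranks = np.searchsorted(pooled, pos, side="right")
    return float(np.sum(phi(ranks / (n_total + 1))))


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked in the right order."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    return float(np.mean(pos[:, None] > neg[None, :]))


# Toy scores (illustrative values only, no ties).
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2, 0.1]
n, m = len(pos), len(neg)

# phi(u) = u gives the rescaled rank sum of the positive scores.
w_mww = two_sample_rank_statistic(pos, neg, phi=lambda u: u)

# Without ties, the rank sum equals n(n + 1)/2 plus the number of
# correctly ordered pairs, so the AUC is an affine function of w_mww.
auc_from_ranks = ((n + m + 1) * w_mww - n * (n + 1) / 2) / (n * m)
auc_direct = empirical_auc(pos, neg)

print(w_mww, auc_from_ranks, auc_direct)
```

Other choices of φ put more weight on the highest ranks; passing a different callable as `phi` is enough to experiment with them in this sketch.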