A utility-based performance metric for ROC analysis of N-class classification tasks

Darrin C Edwards, Charles E Metz

doi:10.1117/12.710083

Abstract

We have shown previously that an obvious generalization of the area under an ROC curve (AUC) cannot serve as a useful performance metric in classification tasks with more than two classes. We define a new performance metric that is grounded in the concept of expected utility familiar from ideal observer decision theory, but that should not suffer from the issues of dimensionality and degeneracy inherent in the hypervolume under the ROC hypersurface in tasks with more than two classes. In the present work, we compare this performance metric with the traditional AUC metric in a variety of two-class tasks. Our numerical studies suggest that the behavior of the proposed metric is consistent with that of the AUC in a wide range of two-class classification tasks, while an analytical investigation of three-class "near-guessing" observers supports our claim that the proposed metric is well-defined and positive in the limit as the observer's performance approaches that of the guessing observer.
