Abstract

The Area Under the ROC Curve (AUC) is an important metric for evaluating binary classifiers, and many algorithms have been proposed to optimize AUC approximately. This raises the question of whether the generally insignificant gains observed in previous studies are due to inherent limitations of the metric or to the inadequate quality of optimization. To better understand the value of optimizing for AUC, we present an efficient algorithm, AUC-opt, that finds the provably optimal AUC linear classifier in $\mathbb{R}^2$ in $\mathcal{O}(n^+ n^- \log(n^+ n^-))$ time, where $n^+$ and $n^-$ are the numbers of positive and negative samples, respectively. The algorithm extends naturally to $\mathbb{R}^d$ in $\mathcal{O}((n^+ n^-)^{d-1} \log(n^+ n^-))$ by recursively calling AUC-opt in lower-dimensional spaces. We prove that the problem is NP-complete when $d$ is not fixed, by a reduction from the open hemisphere problem. Compared with other methods, experiments show that AUC-opt achieves statistically significant improvements on between 17 and 40 of 50 t-SNE training datasets in $\mathbb{R}^2$, and on between 4 and 42 in $\mathbb{R}^3$. However, the gain generally proves insignificant on most testing datasets compared to the best standard classifiers. Similar observations hold for nonlinear AUC methods on real-world datasets.
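The abstract does not spell out AUC-opt's procedure, so the following is only a minimal Python sketch of the objective it optimizes, not the paper's algorithm: the AUC of a fixed linear scorer equals the Wilcoxon-Mann-Whitney statistic, i.e. the fraction of (positive, negative) pairs ranked correctly, computable by brute force in $\mathcal{O}(n^+ n^-)$. Since adding a bias shifts all scores equally and leaves ranks unchanged, only the direction $w$ matters for AUC. All function and variable names below are illustrative.

```python
# Illustrative sketch only; AUC-opt itself is not reproduced here.
import numpy as np

def auc_of_direction(w, X_pos, X_neg):
    """AUC of the linear scorer x -> w.x, via pairwise comparison in O(n+ * n-)."""
    s_pos = X_pos @ w                      # scores of the n+ positive samples
    s_neg = X_neg @ w                      # scores of the n- negative samples
    diff = s_pos[:, None] - s_neg[None, :]
    # Correctly ranked pairs count 1, ties count 1/2 (standard AUC convention).
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

# Toy usage in R^2: a coarse angle sweep over candidate directions.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=[1.0, 1.0], size=(30, 2))
X_neg = rng.normal(loc=[-1.0, -1.0], size=(40, 2))
angles = np.linspace(0.0, np.pi, 180, endpoint=False)
best = max(angles, key=lambda a: auc_of_direction(np.array([np.cos(a), np.sin(a)]), X_pos, X_neg))
print("best sampled angle:", best)
```

Note that such a fixed-grid sweep is only approximate; an exact method must account for every direction at which the pairwise ranking can change. The $n^+ n^- \log(n^+ n^-)$ factor in the paper's $\mathbb{R}^2$ bound is consistent with sorting the $n^+ n^-$ pairwise critical directions, though the abstract itself does not state this.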
