Abstract

Assessing the predictive accuracy of black box classifiers is challenging in the absence of labeled test datasets. In these scenarios we may need to rely on a human oracle to evaluate individual predictions, which raises the challenge of designing query algorithms that guide the search toward the points that provide the most information about the classifier's predictive characteristics. Previous works have focused on developing utility models and query algorithms for discovering unknown unknowns: misclassifications with a predictive confidence above some arbitrary threshold. However, these methods tend to reward the discovery of misclassifications that occur at the rate indicated by their confidence values. Such searches may reveal nothing more than a correct assessment of predictive certainty, and as a result they do not help us mitigate the risks that arise when the model's confidence in its predictions exceeds its actual accuracy. We propose a novel problem formulation that instead searches for overconfident unknown unknowns. Specifically, we propose a facility locations utility model and a corresponding greedy query algorithm for this search. Through robust empirical experiments we demonstrate that the greedy query algorithm with the facility locations utility model outperforms previous methods in discovering overconfident unknown unknowns.
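The distinction between misclassifications that occur at the rate implied by the confidence values and genuinely overconfident predictions can be illustrated with a small sketch. This is not the paper's utility model or query algorithm; the function name `overconfidence_gap` and the oracle feedback below are hypothetical and serve only to show the idea of comparing stated confidence with observed accuracy.

```python
def overconfidence_gap(confidences, correct):
    """Mean predicted confidence minus empirical accuracy.

    confidences: the classifier's confidence for each queried prediction.
    correct: oracle feedback, True where the prediction was right.
    A gap near zero means errors occur at the rate the confidence implies;
    a clearly positive gap means the classifier is overconfident.
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(1 for is_right in correct if is_right) / len(correct)
    return mean_confidence - accuracy


# Hypothetical oracle feedback for ten predictions made at confidence 0.8.
confidences = [0.8] * 10

# Two errors in ten: misclassifications exist, but at the expected rate.
as_expected = [True] * 8 + [False] * 2
# Five errors in ten: accuracy is well below the stated confidence.
overconfident = [True] * 5 + [False] * 5

gap_as_expected = overconfidence_gap(confidences, as_expected)
gap_overconfident = overconfidence_gap(confidences, overconfident)

print(f"gap when errors match confidence: {gap_as_expected:.2f}")
print(f"gap when the model is overconfident: {gap_overconfident:.2f}")
```

In the first case a search that simply counts high-confidence misclassifications would still report two discoveries, even though the confidence values are accurate; only the second case reflects the model deficiency the abstract is concerned with.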
