Abstract
In this paper, we study the post-hoc calibration of modern neural networks, a problem that has drawn considerable attention in recent years. Despite the plethora of calibration methods proposed, there is still no consensus on the inherent complexity of the task: while some authors claim that simple functions solve the problem, others suggest that more expressive models are needed to capture miscalibration. As a first approach, we focus on the task of confidence scaling, specifically on post-hoc methods that generalize Temperature Scaling, which we refer to as the Adaptive Temperature Scaling family. We begin by demonstrating that while complex models such as neural networks provide an advantage when data is ample, they fail when data is limited, a scenario common in fields such as medical diagnosis. We then show that, under ideal data conditions, the more expressive methods learn a relationship between the entropy of a prediction and its level of overconfidence. Based on this observation, we propose Entropy-based Temperature Scaling, a simple method that scales the confidence of a prediction according to its entropy. Results show that our method obtains state-of-the-art performance and is robust to data scarcity. Moreover, the proposed model enables a deeper understanding of the calibration process through the interpretation of entropy as a measure of uncertainty in the network outputs.
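To make the idea concrete, the sketch below illustrates how a per-sample temperature driven by prediction entropy could be applied to logits. The abstract does not specify the entropy-to-temperature mapping, so the linear mapping with parameters (a, b) used here is purely a hypothetical assumption, not the paper's actual parameterization.

```python
# Minimal sketch of entropy-based temperature scaling (illustrative only).
# Assumption not stated in the abstract: the entropy-to-temperature mapping is
# modeled as a linear function of the prediction entropy with two scalar
# parameters (a, b); the paper's actual parameterization may differ.
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy_temperature_scaling(logits, a, b):
    """Rescale logits with a per-sample temperature driven by prediction entropy."""
    probs = softmax(logits)                                # uncalibrated confidences
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=-1)    # Shannon entropy per sample
    temperature = np.exp(a * ent + b)                      # hypothetical mapping, kept positive
    return softmax(logits / temperature[:, None])          # calibrated confidences

# Usage: parameters (a, b) would be fit on a held-out validation set,
# e.g. by minimizing the negative log-likelihood of the calibrated probabilities.
calibrated = entropy_temperature_scaling(np.random.randn(4, 10), a=0.5, b=0.0)
```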