Conventional aggregate performance measures (e.g., mean squared error) computed over the whole dataset do not provide the desired safety and quality assurance for each individual prediction made by a machine learning (ML) model in risk-sensitive regression problems. In this paper, we propose an informative indicator $\mathcal{R}_x$ to quantify model reliability for individual prediction (MRIP), with the aim of safeguarding the use of ML models in mission-critical applications. Specifically, we define the reliability of an ML model with respect to its prediction on an individual input $x$ as the probability that the difference between the model prediction and the actual observation falls within a small interval when the input varies within a small neighborhood of $x$ subject to a preset distance constraint, namely $\mathcal{R}_x = P\left(|y^* - \hat{y}^*| \le \varepsilon \mid x^* \in B_x\right)$, where $y^*$ denotes the observed target value for the input $x^*$, $\hat{y}^*$ denotes the model prediction for the input $x^*$, and $x^*$ is an input in the neighborhood of $x$ defined by $B_x = \{x^* : \|x^* - x\| \le \delta\}$. The MRIP indicator $\mathcal{R}_x$ provides a direct, objective, quantitative, and general-purpose measure of the "reliability", or probability of success, of the ML model for each individual prediction by exploiting the local information associated with the input $x$ and the ML model. Second, to mitigate the intensive computational effort involved in MRIP estimation, we develop a two-stage ML-based framework that directly learns the relationship between $x$ and its MRIP $\mathcal{R}_x$, so that a reliability estimate can be provided instantly for any unseen input. Third, we propose an information gain-based approach to help determine a threshold value for $\mathcal{R}_x$ in support of decision making on when to accept, or abstain from relying on, the ML model prediction.
Comprehensive computational experiments and quantitative comparisons with existing methods on a broad range of real-world datasets show that the developed ML-based framework for MRIP estimation performs robustly in improving the reliability estimate of individual predictions. The MRIP indicator $\mathcal{R}_x$ thus provides an essential safety net when adopting ML models in risk-sensitive environments.
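As an illustration of the definition of the MRIP indicator, the following minimal sketch computes a naive empirical estimate from held-out data: the fraction of held-out inputs within distance δ of x whose absolute prediction error is at most ε. It is not the two-stage framework proposed in the paper. The function name `empirical_mrip`, the use of the Euclidean norm for the distance constraint, and the toy data are all assumptions made for illustration.

```python
import math


def empirical_mrip(x, X_val, y_val, predict, eps, delta):
    """Naive empirical estimate of the MRIP indicator at input x.

    Assumptions (not taken from the paper): Euclidean distance defines the
    neighborhood B_x, and held-out pairs (X_val, y_val) are available.
    Returns None when no held-out input falls inside B_x.
    """
    hits = 0
    total = 0
    for x_star, y_star in zip(X_val, y_val):
        # Keep only held-out inputs inside the ball of radius delta around x.
        if math.dist(x_star, x) <= delta:
            total += 1
            # Count a success when the absolute error is within eps.
            if abs(y_star - predict(x_star)) <= eps:
                hits += 1
    return hits / total if total else None


# Toy usage with a hypothetical model that predicts the first feature.
def toy_predict(x_star):
    return x_star[0]


X_val = [[0.0], [0.1], [0.2], [5.0]]
y_val = [0.0, 0.1, 1.0, 5.0]
r_x = empirical_mrip([0.1], X_val, y_val, toy_predict, eps=0.05, delta=0.15)
print(r_x)
```

In this toy case three held-out inputs fall inside the neighborhood and two of them are predicted within ε. An estimate of this kind depends on having enough held-out points near x, which is one reason a learned mapping from x to its reliability may be preferable in practice.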