Abstract

Conformal predictors use machine learning models to output prediction sets. For regression, a prediction set is simply a prediction interval. All conformal predictors are valid, meaning that the error rate on novel data is bounded by a preset significance level. The key performance metric for conformal predictors is therefore their efficiency, i.e., the size of the prediction sets. Inductive conformal predictors obtain their prediction regions using real-valued functions, called nonconformity functions, together with a calibration set, i.e., a set of labeled instances not used for model training. In state-of-the-art conformal regressors, the nonconformity functions are normalized, i.e., they include a component estimating the difficulty of each instance. In this study, conformal regressors are built on top of ensembles of bagged neural networks, and several nonconformity functions are evaluated. In addition, the option to calibrate on out-of-bag instances, instead of setting aside a separate calibration set, is investigated. The experiments, using 33 publicly available data sets, show that normalized nonconformity functions can produce smaller prediction sets, but that the efficiency is highly dependent on the quality of the difficulty estimation. Specifically, in this study, the most efficient normalized nonconformity function estimated the difficulty of an instance by calculating the average error of neighboring instances. These results are consistent with previous studies using random forests as underlying models. Calibrating on out-of-bag instances did, however, lead to more efficient conformal predictors only on smaller data sets, which is in sharp contrast to the random forest study, where out-of-bag calibration was significantly better overall.
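
As an illustration of the procedure summarized above, the sketch below implements an inductive conformal regressor with a normalized nonconformity function: the underlying model is an ensemble of bagged neural networks, and the difficulty of an instance is estimated as the average absolute error of its k nearest training neighbors, the most efficient approach in the study. This is a minimal sketch under assumed settings, not the paper's implementation: the ensemble configuration, the number of neighbors k, the sensitivity parameter beta, the calibration split, and the helper fit_icp are all illustrative, and it calibrates on a held-out set rather than on out-of-bag instances.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.neural_network import MLPRegressor

def fit_icp(X, y, epsilon=0.1, k=5, beta=0.01):
    """Fit an inductive conformal regressor; returns an interval predictor
    with error rate bounded by epsilon (hyperparameters are illustrative)."""
    # Set aside a calibration set not used for model training.
    X_tr, X_cal, y_tr, y_cal = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # Underlying model: an ensemble of bagged neural networks
    # (sklearn >= 1.2; older versions use base_estimator=).
    model = BaggingRegressor(
        estimator=MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000),
        n_estimators=10, random_state=0).fit(X_tr, y_tr)

    # Difficulty estimate: average absolute training error of the k nearest
    # neighbors; beta > 0 avoids division by zero and tempers normalization.
    train_err = np.abs(y_tr - model.predict(X_tr))
    nn = NearestNeighbors(n_neighbors=k).fit(X_tr)

    def difficulty(X_q):
        _, idx = nn.kneighbors(X_q)
        return train_err[idx].mean(axis=1) + beta

    # Normalized nonconformity scores on the calibration set:
    # alpha_i = |y_i - y_hat_i| / sigma_i.
    alphas = np.abs(y_cal - model.predict(X_cal)) / difficulty(X_cal)

    # Conformal quantile: the ceil((1 - epsilon)(n + 1))-th smallest score
    # (clamped to the largest score for very small calibration sets).
    n = len(alphas)
    q = min(np.ceil((1 - epsilon) * (n + 1)) / n, 1.0)
    alpha_s = np.quantile(alphas, q, method="higher")

    def predict_interval(X_q):
        # Interval: y_hat +/- alpha_s * sigma(x); harder instances get
        # wider intervals, easier ones narrower intervals.
        y_hat = model.predict(X_q)
        half = alpha_s * difficulty(X_q)
        return y_hat - half, y_hat + half

    return predict_interval

if __name__ == "__main__":
    # Synthetic heteroscedastic data, so the normalization has an effect.
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(500, 1))
    y = np.sin(X[:, 0]) + rng.normal(scale=0.1 + 0.1 * np.abs(X[:, 0]))
    lower, upper = fit_icp(X, y, epsilon=0.1)(X[:5])
```

The normalization is what allows efficiency gains: the same calibration quantile yields wider intervals for instances in difficult regions and narrower ones in easy regions, so when the difficulty estimate is accurate, the average interval size shrinks without violating validity.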
