Abstract

Distribution regression is the regression setting in which the input objects are distributions. Many machine learning problems can be analyzed in this framework, such as multi-instance learning and learning from noisy data. This paper builds a conformal predictive system (CPS) for distribution regression, where the prediction of the system for a test input is a cumulative distribution function (CDF) of the corresponding test label. The CDF output by a CPS provides useful information about the test label: it can be used to estimate the probability of any event related to the label, and its quantiles yield prediction intervals and point predictions. Furthermore, a CPS has the property of validity, in that the prediction CDFs and the prediction intervals are statistically compatible with the realizations. This property is desirable in many risk-sensitive applications, such as weather forecasting. To the best of our knowledge, this is the first work to extend the learning framework of CPS to distribution regression problems. We first embed the input distributions into a reproducing kernel Hilbert space using kernel mean embeddings approximated by random Fourier features, and then build a fast CPS on top of the embeddings. While inheriting the property of validity from the learning framework of CPS, our algorithm is simple, easy to implement, and fast. The proposed approach is tested on synthetic data sets and can be used to tackle the problem of statistical postprocessing of ensemble forecasts, which demonstrates the effectiveness of our algorithm for distribution regression problems.
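To make the two-stage pipeline concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a Gaussian kernel, one-dimensional sample bags, a ridge regressor on the embeddings, and a split conformal predictive system with residual conformity scores, swapped in here for the fast CPS the abstract refers to. All function names, the synthetic data, and the parameter values (`sigma`, `D`, `lam`, `tau`) are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, bag_size=50):
    # Hypothetical toy task: each input is a bag of samples from N(mu, 1),
    # and the label is mu plus a little noise.
    mu = rng.uniform(-2.0, 2.0, size=n)
    bags = [rng.normal(m, 1.0, size=(bag_size, 1)) for m in mu]
    y = mu + 0.1 * rng.normal(size=n)
    return bags, y

def rff_mean_embedding(bags, W, b):
    # Random Fourier features z(x) = sqrt(2/D) * cos(x W + b); the embedding
    # of a bag is the average of z(x) over the samples in the bag.
    D = W.shape[1]
    return np.stack([np.sqrt(2.0 / D) * np.cos(x @ W + b).mean(axis=0) for x in bags])

def fit_ridge(Z, y, lam=1e-3):
    # Ridge regression weights on the embedded inputs.
    return np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)

def split_cps_cdf(pred, cal_residuals, y_grid, tau=0.5):
    # Split conformal predictive distribution evaluated on y_grid.
    # Conformity score of a candidate label y is y - pred; tau in [0, 1]
    # is the tie-breaking value (fixed here instead of drawn at random).
    m = len(cal_residuals)
    a = (y_grid - pred)[:, None]
    less = (cal_residuals[None, :] < a).sum(axis=1)
    equal = (cal_residuals[None, :] == a).sum(axis=1)
    return (less + tau * (equal + 1)) / (m + 1)

# Assumed settings: Gaussian kernel bandwidth sigma and D random features.
sigma, D = 1.0, 100
W = rng.normal(0.0, 1.0 / sigma, size=(1, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

train_bags, y_train = make_data(200)
cal_bags, y_cal = make_data(100)
test_bags, y_test = make_data(1)

Z_train = rff_mean_embedding(train_bags, W, b)
Z_cal = rff_mean_embedding(cal_bags, W, b)
Z_test = rff_mean_embedding(test_bags, W, b)

w = fit_ridge(Z_train, y_train)
cal_residuals = y_cal - Z_cal @ w      # calibration conformity scores
pred = float(Z_test[0] @ w)            # point prediction for the test bag

y_grid = np.linspace(-4.0, 4.0, 801)
cdf = split_cps_cdf(pred, cal_residuals, y_grid)

# Quantiles of the predictive CDF give a median and a 90% prediction interval.
median = y_grid[np.searchsorted(cdf, 0.5)]
lo, hi = y_grid[np.searchsorted(cdf, 0.05)], y_grid[np.searchsorted(cdf, 0.95)]
print(f"median={median:.2f}, 90% interval=[{lo:.2f}, {hi:.2f}], true label={y_test[0]:.2f}")
```

The sketch keeps the regressor and the calibration step separate, so a different underlying model or conformity measure could be substituted without touching the embedding code. Validity of the resulting intervals relies on the calibration and test examples being exchangeable.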
