Abstract
Regression models for supervised learning problems with a continuous response are commonly understood as models for the conditional mean of the response given predictors. This notion is simple and therefore appealing for interpretation and visualization. Information about the whole underlying conditional distribution is, however, not available from these models. A more general understanding of regression models as models for conditional distributions allows much broader inference, for example, the computation of prediction intervals or probabilistic predictions for exceeding certain thresholds. Several random forest-type algorithms aim at estimating conditional distributions, most prominently quantile regression forests. We propose a novel approach based on a parametric family of distributions characterized by their transformation function. A dedicated novel “transformation tree” algorithm able to detect distributional changes is developed. Based on these transformation trees, we introduce “transformation forests” as an adaptive local likelihood estimator of conditional distribution functions. The resulting predictive distributions are fully parametric yet very general and allow inference procedures, such as likelihood-based variable importances, to be applied in a straightforward way. Supplemental files for this article are available online.
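To illustrate what a local likelihood estimate of a conditional distribution provides, the following is a minimal sketch and not the transformation forest algorithm itself. It assumes Gaussian kernel weights as a stand-in for the forest-derived weights and a normal distribution as a stand-in for the transformation family. The function name `local_normal_fit`, the `bandwidth` parameter, and the synthetic data are hypothetical and serve illustration only.

```python
import math
from statistics import NormalDist


def local_normal_fit(x0, xs, ys, bandwidth=1.0):
    """Weighted maximum-likelihood fit of a normal distribution around x0.

    Assumption: Gaussian kernel weights replace the adaptive weights that a
    forest would supply; the normal family replaces the transformation family.
    """
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    # The weighted normal log-likelihood is maximized by the weighted mean
    # and the (biased) weighted variance.
    mean = sum(w * y for w, y in zip(weights, ys)) / total
    var = sum(w * (y - mean) ** 2 for w, y in zip(weights, ys)) / total
    return NormalDist(mean, math.sqrt(var))


# Synthetic, deterministic data whose spread grows with the predictor.
xs = [i / 10 for i in range(100)]
ys = [x + (0.2 + 0.3 * x) * math.sin(37 * i) for i, x in enumerate(xs)]

dist = local_normal_fit(5.0, xs, ys)

# A 90% prediction interval and a threshold exceedance probability,
# both read directly off the fitted conditional distribution.
lower, upper = dist.inv_cdf(0.05), dist.inv_cdf(0.95)
p_exceed = 1.0 - dist.cdf(6.0)

print(f"90% prediction interval at x = 5: [{lower:.2f}, {upper:.2f}]")
print(f"P(Y > 6 | x = 5) = {p_exceed:.3f}")
```

Because the fitted object is a full distribution rather than a conditional mean, prediction intervals and exceedance probabilities follow without any further modelling step.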