Abstract

A central difficulty in applying random forests (RF) to classification and regression tasks is that too few training examples may fall into certain leaves of the trees to estimate the tree predictions reliably. To cope with this problem, robust imprecise classification and regression RF models, called imprecise RFs, are proposed. They rest on two ideas. First, the imprecision of the tree estimates is taken into account by means of imprecise statistical inference models and confidence interval models. Second, weights are assigned to trees or to groups of trees and are computed so as to correct the RF estimates when the tree predictions are imprecise. The weights can thus be regarded as a robust meta-learner that controls the imprecision of the estimates. Special modifications of the loss functions used to compute the optimal weights for classification and regression are proposed in order to simplify the resulting maximin optimization problems. As a result, simple linear and quadratic optimization problems are obtained, which are straightforward to solve. Numerical examples with real datasets illustrate the proposed robust models and show superior results when the datasets are rather small or noisy.
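
To make the small-leaf problem concrete, the following minimal sketch computes interval-valued class probabilities for a single leaf. It assumes Walley's imprecise Dirichlet model as one possible imprecise statistical inference model; the abstract does not state which model is used. The function name `idm_intervals` and the default hyperparameter `s = 1` are hypothetical choices made for illustration only.

```python
from typing import List, Tuple


def idm_intervals(counts: List[int], s: float = 1.0) -> List[Tuple[float, float]]:
    """Lower and upper class probabilities for one leaf.

    Assumption: imprecise Dirichlet model with hyperparameter s > 0,
    where class k with count n_k out of n examples gets the interval
    [n_k / (n + s), (n_k + s) / (n + s)].
    """
    if s <= 0:
        raise ValueError("s must be positive")
    n = sum(counts)
    return [(c / (n + s), (c + s) / (n + s)) for c in counts]


# Two leaves with the same class proportions but different sample sizes.
small_leaf = idm_intervals([2, 1])      # 3 training examples in the leaf
large_leaf = idm_intervals([200, 100])  # 300 training examples in the leaf

# Interval width is s / (n + s): wide for the small leaf, narrow for the large one.
small_width = small_leaf[0][1] - small_leaf[0][0]
large_width = large_leaf[0][1] - large_leaf[0][0]
```

Under this assumption every interval in a leaf has width s / (n + s), so a leaf holding three examples yields a much wider interval than one holding three hundred. This is the kind of imprecision that the tree weights are meant to account for.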
