Abstract

The most frequent approach to data-driven modeling is to estimate a single strong predictive model. A different strategy is to build a bucket, or ensemble, of models for a given learning task. One can build a set of weak or relatively weak models, such as small neural networks, whose outputs are then combined to produce a reliable prediction. The most prominent examples of such machine-learning ensemble techniques are random forests (Breiman, Mach Learn 45:5–32, 2001) and neural network ensembles (Hansen and Salamon, IEEE Trans Pattern Anal Mach Intell 12:993–1001, 1990), which have found many successful applications in different domains. Liu et al. (Earthquake prediction by RBF neural network ensemble. In: Yin F-L, Wang J, Guo C (eds) Advances in neural networks ISNN 2004. Springer, Berlin, pp 962–969, 2004) use this approach to predict earthquakes, and Shu and Burn (Water Resour Res 40:1–10, 2004) forecast flood frequencies with an ensemble of networks. We start this chapter by describing the bias-variance decomposition of the prediction error. Next, we discuss how aggregated models and randomized models reduce the prediction error by decreasing the variance term of this decomposition. The theoretical developments are inspired by the PhD thesis of Louppe on random forests (Understanding random forests: from theory to practice. PhD dissertation, Faculty of Applied Sciences, University of Liège, 2014).
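For reference, the decomposition mentioned above is the standard one; the notation below is ours and may differ from the chapter's symbols. For a regression target y = f(x) + ε with noise variance σ²_noise, and a model φ_D fitted on a random training set D, the expected squared error at a fixed input x splits into irreducible noise, squared bias, and variance:

    E_D[(y - φ_D(x))²] = σ²_noise + (E_D[φ_D(x)] - f(x))² + Var_D[φ_D(x)].

Aggregation targets the last term: if M randomized models each have variance σ² and pairwise correlation ρ at x, the variance of their average is

    Var[(1/M) Σ_{m=1}^{M} φ_m(x)] = ρσ² + ((1 - ρ)/M) σ²,

which shrinks toward ρσ² as M grows while leaving the bias essentially unchanged; this is the classical argument developed, e.g., in Louppe's thesis.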
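As a minimal, illustrative sketch of the "ensemble of small neural networks" idea (our own toy example, not code from the chapter; the data, network size, and ensemble size are arbitrary, and scikit-learn's MLPRegressor stands in for a generic weak learner), one can train several small networks on bootstrap resamples and average their predictions:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    # Illustrative 1-D regression data: a noisy sine wave (assumed, not from the chapter).
    X = rng.uniform(-3.0, 3.0, size=(200, 1))
    y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)

    # Train M small ("weak") networks, each on a bootstrap resample of the data.
    M = 25
    ensemble = []
    for m in range(M):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap indices
        net = MLPRegressor(hidden_layer_sizes=(5,), max_iter=2000, random_state=m)
        net.fit(X[idx], y[idx])
        ensemble.append(net)

    # The aggregated prediction is the ensemble average: the bias is roughly
    # unchanged, but the variance term of the decomposition is reduced.
    X_test = np.linspace(-3.0, 3.0, 50).reshape(-1, 1)
    y_pred = np.mean([net.predict(X_test) for net in ensemble], axis=0)

Averaging bootstrap-trained models in this way is plain bagging; random forests add a second layer of randomization (random feature subsets at each split) to further decorrelate the ensemble members.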
