Abstract

This research proposes a partial least squares (PLS)-based random forest (RF) with hybrid feature subspace selection for regression problems. Because the average voting strategy of the basic RF may reduce accuracy, PLS is adopted to automatically assign a voting weight to each tree and aggregate the outputs of all trees. To improve feature subspace selection, stratified sampling and embedded feature selection are integrated. First, the variable importance (VI) of each input feature is obtained through embedded feature selection, and the features are divided into two disjoint sets according to VI. Then, during the construction of the trees, stratified sampling over these two sets is used for feature subspace selection. The effectiveness of PLS aggregation and of hybrid feature selection is validated separately on six regression datasets. The superiority of the proposed RF is demonstrated on historical operation datasets from two power plants through a comparison with five other models.
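The two ideas above can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the function names (`split_by_importance`, `stratified_subspace`, `pls1_tree_weights`) are hypothetical, the two feature sets are split at the mean VI, the subspace is allocated to the two sets in proportion to their sizes, and the PLS step uses a single latent component (PLS1) fitted on per-tree predictions.

```python
import random


def split_by_importance(vi, threshold=None):
    """Split feature indices into two disjoint sets (strong, weak) by VI.

    Assumption: the cut-off defaults to the mean VI.
    """
    if threshold is None:
        threshold = sum(vi) / len(vi)
    strong = [j for j, v in enumerate(vi) if v >= threshold]
    weak = [j for j, v in enumerate(vi) if v < threshold]
    return strong, weak


def stratified_subspace(strong, weak, size, rng):
    """Draw `size` features so that both strata are represented.

    Assumptions: both strata are non-empty, 2 <= size <= total features,
    and allocation is proportional to stratum size.
    """
    total = len(strong) + len(weak)
    n_strong = min(len(strong), max(1, round(size * len(strong) / total)))
    n_strong = min(n_strong, size - 1)      # leave room for a weak feature
    n_weak = size - n_strong
    if n_weak > len(weak):                  # weak stratum too small
        n_weak = len(weak)
        n_strong = size - n_weak
    return sorted(rng.sample(strong, n_strong) + rng.sample(weak, n_weak))


def pls1_tree_weights(preds, y):
    """One-component PLS1 fit of targets on per-tree predictions.

    preds[i][j] is the prediction of tree j for sample i.
    Returns (weights, intercept) so that the aggregated output is
    intercept + sum_j weights[j] * preds[i][j].
    Assumption: predictions are not all constant (non-zero covariance with y).
    """
    n, m = len(preds), len(preds[0])
    col_mean = [sum(row[j] for row in preds) / n for j in range(m)]
    y_mean = sum(y) / n
    xc = [[row[j] - col_mean[j] for j in range(m)] for row in preds]
    yc = [v - y_mean for v in y]
    # Weight direction: covariance of each tree's output with the target.
    w = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Latent score and its regression coefficient on the target.
    t = [sum(xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    weights = [q * v for v in w]
    intercept = y_mean - sum(weights[j] * col_mean[j] for j in range(m))
    return weights, intercept


if __name__ == "__main__":
    rng = random.Random(0)
    vi = [0.30, 0.05, 0.25, 0.02, 0.08, 0.30]   # made-up VI scores
    strong, weak = split_by_importance(vi)
    print("subspace:", stratified_subspace(strong, weak, 4, rng))

    y = [1.0, 2.0, 3.0, 4.0]                    # made-up targets
    preds = [[1.1, 2.0], [1.9, 3.5], [3.2, 2.5], [3.8, 3.0]]
    weights, intercept = pls1_tree_weights(preds, y)
    print("tree weights:", weights, "intercept:", intercept)
```

In this sketch, each tree's voting weight comes from how strongly its output covaries with the target, so trees that track the target more closely contribute more than they would under plain averaging. In practice the weights would be fitted on held-out or out-of-bag predictions rather than on the data the trees were trained on, and more than one latent component might be used.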
