This paper proposes a weighted random forests scheme based on hierarchical clustering selection for fault classification in complex industrial processes. Model diversity and the strength of each individual model are two key factors in the performance of ensemble learning methods. To improve both the diversity among classification trees and the performance of the individual trees in a random forest, hierarchical clustering is applied for offline model selection, which also reduces the complexity of online fault classification. In addition, a weighted voting rule replaces majority voting in the random forest, so that well-performing models gain influence and poorly performing ones lose it. Detailed comparative studies between the proposed method and conventional methods are carried out on the Tennessee Eastman (TE) benchmark process.
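The two ideas in the abstract, offline tree selection by hierarchical clustering and weighted voting, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the distance measure, the linkage, the selection rule, or the weights, so the sketch makes several assumptions: trees are compared by their disagreement rate on a validation set, clusters are merged with average linkage, the most accurate tree of each cluster is kept, and each kept tree is weighted by its validation accuracy. All function names and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def accuracy(preds, labels):
    """Fraction of validation samples a tree classifies correctly."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def disagreement(a, b):
    """Fraction of samples on which two trees predict different classes."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_trees(val_preds, n_clusters):
    """Average-linkage agglomerative clustering of trees (assumed choice)."""
    clusters = [[i] for i in range(len(val_preds))]

    def linkage(c1, c2):
        total = sum(disagreement(val_preds[i], val_preds[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two clusters whose members disagree least on average.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]))
        clusters[a] += clusters[b]
        del clusters[b]  # a < b, so index a is unaffected
    return clusters


def select_trees(val_preds, labels, n_clusters):
    """Keep the most accurate tree per cluster; weight = accuracy (assumed)."""
    selected = {}
    for cluster in cluster_trees(val_preds, n_clusters):
        best = max(cluster, key=lambda i: accuracy(val_preds[i], labels))
        selected[best] = accuracy(val_preds[best], labels)
    return selected


def weighted_vote(tree_outputs, weights):
    """Return the class with the largest summed weight."""
    scores = defaultdict(float)
    for tree, label in tree_outputs.items():
        scores[label] += weights[tree]
    return max(scores, key=scores.get)


# Toy validation predictions from five hypothetical trees (fault classes 0-2).
y_val = [0, 0, 1, 1, 2, 2]
val_preds = [
    [0, 0, 1, 1, 2, 2],  # tree 0: all correct
    [0, 0, 1, 1, 2, 1],  # tree 1: near-duplicate of tree 0
    [0, 1, 1, 0, 0, 2],  # tree 2: moderately accurate, distinct errors
    [1, 1, 0, 0, 2, 2],  # tree 3: weak
    [1, 1, 0, 0, 2, 0],  # tree 4: near-duplicate of tree 3
]
weights = select_trees(val_preds, y_val, n_clusters=3)

# Online step: only the selected trees vote, each with its own weight.
online_outputs = {0: 2, 2: 0, 3: 0}
print(weighted_vote(online_outputs, weights))
```

In this toy case the selection keeps trees 0, 2, and 3, so only three trees instead of five are evaluated online. For the sample shown, a plain majority vote would return class 0 (two votes to one), whereas the weighted vote returns class 2 because the single most accurate tree outweighs the two weaker ones.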