Abstract

The random forest algorithm is a flexible, easy-to-use machine learning method that is widely applied to classification problems. However, the traditional random forest has some limitations. Because the randomness it injects into the decision trees occurs almost exclusively in the feature selection performed while the trees are generated, the fixed tree-generation rules can lead to relatively severe overfitting. Moreover, on high-dimensional and imbalanced data, the algorithm's performance degrades sharply, because high-dimensional data usually contains many irrelevant and redundant features. To address these problems, we propose FSRF, an improved random forest algorithm. Building on the traditional random forest, we apply feature selection to preprocess the data and obtain the feature subset with the best classification performance for constructing the forest. We also introduce sparse matrix projection to improve the generation of the random forest. Experiments show that our method reduces the influence of redundant features on classification and improves accuracy.
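The paper itself does not include code, but the pipeline described above (feature-selection preprocessing, a sparse projection step, then a random forest) can be approximated with off-the-shelf components. The sketch below is a minimal illustration assuming scikit-learn; the mutual-information filter, the choice of k=20 selected features, and the projection dimensionality are illustrative assumptions, not the authors' FSRF implementation, in which the sparse projection is applied inside the tree-generation process rather than as a separate pipeline stage.

```python
# Hypothetical FSRF-like pipeline: feature selection + sparse random
# projection + random forest, built from scikit-learn components.
# This is NOT the authors' implementation; all parameter choices are
# illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.random_projection import SparseRandomProjection

# Synthetic high-dimensional data with many irrelevant/redundant features.
X, y = make_classification(n_samples=1000, n_features=500,
                           n_informative=25, n_redundant=50,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

fsrf_like = make_pipeline(
    # Feature-selection preprocessing: keep the k most informative features.
    SelectKBest(mutual_info_classif, k=20),
    # Sparse matrix projection of the selected features.
    SparseRandomProjection(n_components=10, random_state=0),
    # Standard random forest on the projected representation.
    RandomForestClassifier(n_estimators=200, random_state=0),
)

fsrf_like.fit(X_tr, y_tr)
print("test accuracy:", fsrf_like.score(X_te, y_te))
```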
