Abstract

The random forest classifier is widely used across different fields due to its accuracy and robustness. Since its invention, the random forest algorithm has naturally been developed for multi-dimensional vectorial data, since features can be sampled directly during the decision tree construction procedure. In the context of discrete sequence classification, an explicit feature set is not readily available, so a feature extraction algorithm must be employed before building the random forest. However, such a predefined feature subset may limit the diversity of the decision trees, because the full set of candidate features consists of all subsequences. As a result, the predictive accuracy of the constructed random forest classifier may be reduced. To address this, we propose a new algorithm that directly builds a random forest by adaptively choosing features from the set of all subsequences. To improve the running efficiency of our algorithm, the count-suffix tree is utilized to enable fast frequency counting of subsequences, thereby accelerating the generation of each randomized decision tree. Experimental results on 15 real datasets show that our method outperforms state-of-the-art classification algorithms in terms of predictive accuracy. The source code of our method can be found at: https://github.com/JiaqiWang-dlut/RSForest.
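
As a rough illustration of the idea (not the authors' implementation), the sketch below counts subsequence frequencies with a plain dictionary standing in for the count-suffix tree, and draws a random split candidate from the pool of subsequences observed in the training data, which is a simplified view of adaptive feature sampling during tree construction. The function names and the `max_len` cap are hypothetical and only for exposition.

```python
from collections import Counter
import random


def substring_counts(sequence, max_len=3):
    """Count the frequency of every substring of length <= max_len.

    A count-suffix tree would deliver the same counts far more efficiently;
    this brute-force dictionary is only a stand-in for illustration.
    """
    counts = Counter()
    n = len(sequence)
    for i in range(n):
        for j in range(i + 1, min(i + max_len, n) + 1):
            counts[sequence[i:j]] += 1
    return counts


def sample_split_candidate(sequences, rng, max_len=3):
    """Draw one candidate subsequence feature at random from the pool of
    all substrings observed in the training sequences, mimicking the idea
    of choosing features adaptively rather than from a predefined subset."""
    pool = Counter()
    for s in sequences:
        pool.update(substring_counts(s, max_len))
    return rng.choice(sorted(pool))


if __name__ == "__main__":
    rng = random.Random(0)
    data = ["ACGTAC", "GTACGT", "ACACAC"]
    print(substring_counts(data[0]))
    print("candidate feature:", sample_split_candidate(data, rng))
```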
