Abstract

As a representative ensemble machine learning method, the Random Forest (RF) algorithm has been widely used in diverse applications owing to its fast learning speed and high classification accuracy. Research on RF can be classified into two categories: (1) improving the classification accuracy and (2) decreasing the number of trees in a forest. However, most papers on improving the performance of RF have focused on classification accuracy; only a few have focused on reducing the number of trees in a forest. In this paper, we propose a new Covariance-Based Dynamic RF algorithm, called C-DRF. Compared to previous works, the proposed C-DRF algorithm reduces the number of trees while ensuring good-enough classification accuracy. Specifically, by computing the covariance between the number of trees in a forest and the F-measure at each iteration, the proposed algorithm determines whether to increase the number of trees composing the forest. To evaluate the performance of the proposed C-DRF algorithm, we compared its learning time, test time, and memory usage with those of the original RF algorithm on datasets from different application areas. While achieving the same or higher classification accuracy, the proposed C-DRF algorithm improves on the original RF algorithm by as much as 58.68% in learning time, 47.91% in test time, and 68.06% in memory usage on average. As a practical application area, we also show that the proposed C-DRF algorithm is more efficient than state-of-the-art RF algorithms in the Network Intrusion Detection (NID) area.
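The covariance-based stopping rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the helper names (`cdrf_tree_count`, `evaluate_f1`), the sliding-window size, and the threshold value are assumptions chosen for clarity.

```python
def covariance(xs, ys):
    """Population covariance between two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def cdrf_tree_count(evaluate_f1, window=3, threshold=1e-4, max_trees=100):
    """Grow the forest one tree per iteration and stop once the covariance
    between forest size and F-measure over the last `window` iterations
    falls below `threshold`, i.e. adding trees no longer improves F-measure.

    `evaluate_f1(n)` is assumed to train a forest of n trees and return
    its F-measure on a validation set.
    """
    sizes, scores = [], []
    for n in range(1, max_trees + 1):
        sizes.append(n)
        scores.append(evaluate_f1(n))
        if len(sizes) >= window:
            # Near-zero (or negative) covariance means F-measure has
            # stopped rising with forest size: keep the current n trees.
            if covariance(sizes[-window:], scores[-window:]) < threshold:
                return n
    return max_trees


# Toy usage: an F-measure curve that saturates at 0.9 after 4 trees.
n_trees = cdrf_tree_count(lambda n: min(0.9, 0.5 + 0.1 * n))
```

With this saturating toy curve, the loop halts shortly after the F-measure plateaus, so the forest stays small; in practice `evaluate_f1` would wrap actual tree training and evaluation.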

Highlights

  • As one of the classification modeling approaches, decision tree learning has been widely used in various fields such as statistics, data mining, and machine learning

  • We show that the proposed C-DRF algorithm reduces the learning time, memory usage, and test time compared to other Random Forest (RF) algorithms in various applications such as network intrusion detection

  • By analyzing the covariance between the number of trees in a forest and the F-measure at each iteration, the proposed C-DRF algorithm composes a forest with the minimum number of trees while ensuring good-enough classification accuracy

Introduction

As one of the classification modeling approaches, decision tree learning has been widely used in various fields such as statistics, data mining, and machine learning. The proposed C-DRF algorithm reduces the number of trees composing a forest while keeping the classification accuracy close to that of the original RF algorithm [12]. Our contributions are as follows: (1) to the best of our knowledge, we propose the first RF learning algorithm that uses covariance to generate the minimum number of trees while keeping the classification accuracy close to that of the original RF algorithm; (2) we show that the proposed algorithm reduces the number of trees in a forest while maintaining accuracy close to the original RF algorithm [12, 19]; and (3) we show that the proposed C-DRF algorithm reduces the learning time, memory usage, and test time compared to other RF algorithms in various applications such as network intrusion detection.

Related Works
C-DRF Algorithm
Complexity Analysis
Experimental Evaluation
Findings
Discussion
Conclusion