Ensemble Pruning of RF via Multi-Objective TLBO Algorithm and Its Parallelization on Spark

Lanjun Wan, Changyun Li, Zhibing Wang, Xiaojun Deng, Kun Gong, Gen Zhang

doi:10.1109/access.2021.3130905


Open Access

https://doi.org/10.1109/access.2021.3130905


Journal: IEEE Access	Publication Date: Jan 1, 2021
Citations: 4	License type: CC BY 4.0

Affiliation: Hunan University of Technology

Abstract

Ensemble learning has been widely used in various fields. However, in a big data environment, too many base classifiers increase the classification time of the ensemble classifier, while reducing the number of base classifiers can lower its classification accuracy. Therefore, the multi-objective teaching-learning-based optimization (MO-TLBO) algorithm is used for ensemble pruning of random forest (RF) to improve both the classification accuracy and the classification speed of RF. The MO-TLBO algorithm aims to maximize classification accuracy and minimize classification time, so that it can find a sub-forest with higher classification accuracy and faster classification speed. In addition, because ensemble pruning of RF via the MO-TLBO algorithm is computationally expensive in a big data environment, a vote set is constructed to speed up the fitness evaluation process. On the Spark platform, the RF improved by the MO-TLBO algorithm (MO-TLBO-RF) is parallelized based on data parallelism. A Shuffle optimization strategy is proposed to reduce the number of Shuffles in the execution of parallel MO-TLBO-RF. The proposed MO-TLBO-RF is applied to rolling bearing fault diagnosis. The experimental results show that the algorithm can obtain an RF with high fault diagnosis accuracy and fast fault diagnosis speed. The results also show that the ensemble pruning time can be greatly reduced via the vote set and the parallelization of MO-TLBO-RF.
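The abstract does not spell out how the vote set is built, so the following is only a minimal sketch under one plausible reading: each tree's prediction on every validation sample is cached once, and the accuracy of any candidate sub-forest is then computed from the cached votes without re-running the trees. The names `build_vote_set` and `subforest_accuracy`, the binary `mask` encoding, and the tie-breaking rule are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from typing import Callable, List, Sequence


def build_vote_set(trees: Sequence[Callable], samples: Sequence) -> List[List[int]]:
    """Cache every tree's predicted label for every sample (hypothetical helper).

    vote_set[t][i] is the label that tree t predicts for sample i.
    """
    return [[tree(x) for x in samples] for tree in trees]


def subforest_accuracy(vote_set: List[List[int]],
                       mask: Sequence[int],
                       labels: Sequence[int]) -> float:
    """Majority-vote accuracy of the sub-forest selected by a 0/1 mask.

    Only cached votes are read, so no tree is evaluated again.
    """
    selected = [votes for votes, keep in zip(vote_set, mask) if keep]
    if not selected:
        return 0.0  # assumption: an empty sub-forest scores zero
    correct = 0
    for i, label in enumerate(labels):
        tally = Counter(votes[i] for votes in selected)
        # Assumption: ties are broken in favour of the smallest label.
        predicted = min(tally, key=lambda c: (-tally[c], c))
        correct += int(predicted == label)
    return correct / len(labels)


# Toy "trees": plain functions standing in for trained decision trees.
toy_trees = [lambda x: int(x > 0), lambda x: int(x > 1), lambda x: 1]
toy_samples = [-1.0, 0.5, 2.0]
toy_labels = [0, 1, 1]
toy_vote_set = build_vote_set(toy_trees, toy_samples)
```

With this layout, evaluating a new candidate sub-forest costs one pass over the cached votes, which is the kind of saving the abstract attributes to the vote set.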

Highlights

Ensemble learning combines multiple base classifiers to form an ensemble classifier, which has been widely used in biology, transportation, energy, industry, medicine, and other fields [1]–[5]
In order to reduce the enormous computational time of ensemble pruning of random forest (RF) via the multi-objective teaching-learning-based optimization (MO-TLBO) algorithm in a big data environment, the RF improved by the MO-TLBO algorithm is parallelized on Spark based on data parallelism, a Shuffle optimization strategy is proposed, and a vote set is constructed
Comparison of different swarm intelligence optimization algorithms: to evaluate the effectiveness of the MO-TLBO algorithm, three different swarm intelligence optimization algorithms are used for ensemble pruning of RF, yielding RF improved by the multi-objective genetic algorithm (MO-GA-RF), RF improved by the multi-objective whale optimization algorithm (MO-WOA-RF), and MO-TLBO-RF

Summary

INTRODUCTION

Ensemble learning combines multiple base classifiers to form an ensemble classifier, which has been widely used in biology, transportation, energy, industry, medicine, and other fields [1]–[5]. Existing studies use multi-objective meta-heuristic algorithms to effectively improve the classification accuracy and reduce the size of the ensemble classifier, but they do not take the classification time of the ensemble classifier as an objective. In order to reduce the enormous computational time of ensemble pruning of RF via the MO-TLBO algorithm in a big data environment, the RF improved by the MO-TLBO algorithm is parallelized on Spark based on data parallelism, a Shuffle optimization strategy is proposed, and a vote set is constructed. The MO-TLBO algorithm, whose two goals are the maximization of classification accuracy and the minimization of classification time, is proposed, and a crossover operator with an adaptive crossover rate is designed to better find the best combination of base classifiers.
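To make the two-objective setting concrete, here is a small sketch of standard Pareto dominance applied to candidate sub-forests scored by (accuracy, classification time). It illustrates the general multi-objective idea only; the function names and the sample scores are made up for illustration, and the summary does not state how MO-TLBO itself ranks or archives solutions.

```python
from typing import List, Tuple

# (classification accuracy, classification time); accuracy is maximized,
# time is minimized. The tuple layout is an assumption for this sketch.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on both goals and better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep the scores that no other score dominates."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]


# Hypothetical sub-forest scores, not results from the paper.
example_scores = [(0.95, 3.0), (0.93, 1.5), (0.90, 2.0), (0.95, 4.0)]
example_front = pareto_front(example_scores)
```

In this toy data, `(0.90, 2.0)` loses to `(0.93, 1.5)` on both goals and `(0.95, 4.0)` is slower than `(0.95, 3.0)` at equal accuracy, so only the remaining two trade-offs survive.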

THE CLASSIC TLBO ALGORITHM
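
As background for this section, the sketch below follows the commonly cited single-objective form of classic TLBO for continuous minimization: a teacher phase that moves each learner toward the best solution relative to the class mean (with a teaching factor of 1 or 2), and a learner phase in which each learner interacts with a random peer. The function name, parameters, bounds handling, and the sphere test function are assumptions for illustration; this is not the paper's multi-objective, binary-encoded variant.

```python
import random
from typing import Callable, List, Tuple


def tlbo_minimize(f: Callable[[List[float]], float], dim: int,
                  bounds: Tuple[float, float], pop_size: int = 20,
                  iterations: int = 100, seed: int = 0) -> Tuple[List[float], float]:
    """Minimize f over a box with classic TLBO (illustrative sketch)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(v: float) -> float:
        return max(lo, min(hi, v))

    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [f(x) for x in pop]

    for _ in range(iterations):
        # Teacher phase: shift learners toward the best solution.
        teacher = pop[min(range(pop_size), key=fit.__getitem__)][:]
        mean = [sum(x[d] for x in pop) / pop_size for d in range(dim)]
        for i in range(pop_size):
            tf = rng.randint(1, 2)  # teaching factor, 1 or 2
            cand = [clip(pop[i][d] + rng.random() * (teacher[d] - tf * mean[d]))
                    for d in range(dim)]
            fc = f(cand)
            if fc < fit[i]:  # greedy acceptance
                pop[i], fit[i] = cand, fc

        # Learner phase: learn from a randomly chosen peer.
        for i in range(pop_size):
            j = rng.choice([k for k in range(pop_size) if k != i])
            # Move away from a worse peer, toward a better (or equal) one.
            sign = 1.0 if fit[i] < fit[j] else -1.0
            cand = [clip(pop[i][d] + sign * rng.random() * (pop[i][d] - pop[j][d]))
                    for d in range(dim)]
            fc = f(cand)
            if fc < fit[i]:
                pop[i], fit[i] = cand, fc

    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


def sphere(x: List[float]) -> float:
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_val = tlbo_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

Apart from population size and iteration count, the loop has no algorithm-specific tuning parameters, which is a frequently noted property of TLBO.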


PERFORMANCE ANALYSIS OF MODEL TRAINING AND FAULT DIAGNOSIS


CONCLUSION


Similar Papers

An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm
Lanjun Wan ... Gen Zhang
IEEE Access | VOL. 9
01 Jan 2020

Detection of visual faults in photovoltaic modules using a stacking ensemble approach
Naveen Venkatesh S ... Mohammadreza Aghaei
Heliyon | VOL. 10
01 Mar 2024

Rolling Bearing Fault Diagnosis Method Based on Parallel QPSO-BPNN Under Spark-GPU Platform
Lanjun Wan ... Junfeng Man
IEEE Access | VOL. 9
01 Jan 2020

Fault Diagnosis of Helical Gearbox through Vibration Signals using Wavelet Features, J48 Decision Tree and Random Forest Classifiers
Ayush Kimothi ... Ameet Singh
Indian Journal of Science and Technology | VOL. 9
14 Sep 2016
