Abstract

Data classification algorithms are widely used in engineering, but data measured in practice often contain noise of different types and degrees, such as the vibration noise caused by water flow when measuring the natural frequencies of aqueducts or other hydraulic structures, and this noise reduces classification accuracy. In reality, such noise is often disorganized and stochastic, and some existing algorithms perform poorly in the face of this non-Gaussian noise. Classification algorithms that remain accurate under such noise are therefore needed. To address this issue, this paper proposes a hybrid algorithm that combines robust principal component analysis (RPCA) with the multigroup random walk random forest (MRWRF). On the one hand, RPCA can effectively remove part of the non-Gaussian noise; on the other hand, MRWRF can select a better number of decision trees (DTs), which improves the robustness and classification performance of the random forest (RF). Together, RPCA and MRWRF can effectively classify data contaminated by non-Gaussian noise. Compared with other existing algorithms, the hybrid algorithm shows strong robustness and better classification performance, and thus offers a new approach to data classification problems in engineering.

Highlights

  • Data classification is one of the data mining problems receiving enormous attention [1]; many scholars have carried out relevant research and made great progress in many fields [2,3,4,5], and it is often used in engineering

  • With the improvement of machine learning technology, many scholars have made various attempts to better solve engineering problems. The efforts fall mainly into two types: (1) some scholars have proposed new data classification algorithms, such as the chaotic salp swarm algorithm [11] and data classification methods based on fuzzy logic [12]; and (2) some scholars have improved an existing algorithm, usually by optimizing its parameters or by combining two or more algorithms, such as combining the support vector machine (SVM) with KNN and applying the method to visual category recognition [13]

  • Robust principal component analysis (RPCA) can effectively remove part of the non-Gaussian noise from raw data, and multigroup random walk random forest (MRWRF) can select a better number of decision trees (DTs), which improves random forest (RF) robustness and classification performance. The hybrid of RPCA and MRWRF can effectively classify data with non-Gaussian noise, and a detailed introduction will be given


Summary

Introduction

Data classification is one of the data mining problems receiving enormous attention [1]; many scholars have carried out relevant research and made great progress in many fields [2,3,4,5], and it is often used in engineering. Several new noise reduction algorithms have been proposed and applied in new engineering fields in recent years: the wavelet packet transform (WPT) has been used to extract data from acoustic emission signals containing noisy data [19]; a spectral kurtosis adaptive bandpass filter has been used for noise reduction in desert seismic data [20]; the 1D undecimated discrete wavelet transform (UDWT) has been used to attenuate random noise and ground roll [21]; a denoising method based on variational mode decomposition (VMD) has been proposed for simultaneous noise reduction and preservation of seismic signals [22]; and PCA + linear discriminant analysis (LDA) has been used to extract and denoise the original data, with nearest neighbor (NN) then used to classify the processed data [23]. Therefore, we first try to use robust principal component analysis (RPCA) [24], which has excellent performance in image noise reduction, to denoise signal data containing non-Gaussian noise in the engineering field. RPCA can effectively remove part of the non-Gaussian noise from raw data, and MRWRF can select a better number of decision trees (DTs), which improves random forest (RF) robustness and classification performance. The hybrid of RPCA and MRWRF can effectively classify data with non-Gaussian noise, and a detailed introduction will be given.
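
As a point of reference, RPCA is commonly posed as principal component pursuit: an observed data matrix is split into a low-rank part and a sparse part that absorbs large, irregular (non-Gaussian) disturbances. The sketch below states that standard formulation. The symbols M (observed data), L (low-rank component), S (sparse component), and λ (trade-off weight) are notation assumed here, and the text above does not say which RPCA variant or weight the paper uses.

```latex
\min_{L,\,S}\ \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad M = L + S
```

Here \|L\|_{*} is the nuclear norm (the sum of the singular values of L) and \|S\|_{1} is the entrywise l1 norm. A common default for an m × n matrix is λ = 1/\sqrt{\max(m, n)}; whether the paper adopts this value is not stated in the text above.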

Related Work
The Improved Algorithm
Numerical Simulation Case
Algorithm Performance Verification and Result Analysis
Findings
Conclusion
