The K-Means clustering algorithm is a typical unsupervised learning method and has been widely used in fault diagnosis. However, the traditional serial K-Means algorithm struggles to cluster the massive running-state monitoring data of rolling bearings efficiently and accurately. Therefore, a novel fault diagnosis method for rolling bearings is proposed, built on a Spark-based parallel ant colony optimization (ACO)-K-Means clustering algorithm. First, a Spark-based three-layer wavelet packet decomposition approach is developed to efficiently preprocess the running-state monitoring data into eigenvectors, which are stored in the Hadoop Distributed File System (HDFS) and serve as the input of the ACO-K-Means clustering algorithm. Second, an ACO-K-Means clustering algorithm suited to rolling bearing fault diagnosis is proposed to improve diagnosis accuracy: the ACO algorithm selects globally optimal initial clustering centers for K-Means from all eigenvectors, and a K-Means clustering algorithm based on weighted Euclidean distance then clusters all eigenvectors to obtain a rolling bearing fault diagnosis model. Third, the ACO-K-Means clustering algorithm is parallelized on a Spark platform so that it can make full use of the computing resources of a cluster to process large-scale rolling bearing datasets in parallel. Extensive experiments are conducted to verify the effectiveness of the proposed fault diagnosis method. Experimental results show that the proposed method not only achieves good fault diagnosis accuracy but also provides high model training efficiency and fault diagnosis efficiency in a big data environment.
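The clustering step described above can be illustrated with a minimal serial sketch of K-Means under a weighted Euclidean distance. This is not the authors' implementation: the function names, the per-feature `weights` vector, and the toy data are hypothetical, the initial centers are simply passed in rather than selected by ACO, and no Spark parallelization is shown.

```python
from math import sqrt


def weighted_distance(x, c, w):
    """Weighted Euclidean distance between eigenvector x and center c.

    w holds one non-negative weight per feature (hypothetical; the abstract
    does not say how the weights are chosen).
    """
    return sqrt(sum(wi * (xi - ci) ** 2 for xi, ci, wi in zip(x, c, w)))


def weighted_kmeans(points, centers, weights, max_iter=100):
    """Lloyd-style K-Means using weighted_distance for the assignment step.

    `centers` are the initial centers. In the proposed method they would come
    from ACO; here they are supplied by the caller (assumption).
    """
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda k: weighted_distance(p, centers[k], weights))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        if new_centers == centers:  # converged: centers stopped moving
            break
        centers = new_centers
    return centers, labels


# Toy usage with two well-separated groups of 2-D "eigenvectors".
toy_points = [[0, 0], [0, 1], [10, 10], [10, 11]]
toy_centers, toy_labels = weighted_kmeans(
    toy_points, centers=[[0, 0], [10, 10]], weights=[1, 1]
)
```

With all weights equal to 1 this reduces to ordinary K-Means; unequal weights let more informative wavelet packet energy features dominate the assignment step.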