Abstract
Traditional data clustering algorithms lack comprehensive performance, which leads to low clustering purity and long clustering time. In addition, their clustering results are only weakly consistent with the original data distribution. To address this, a multidimensional discrete big data clustering algorithm based on a dynamic grid is proposed. First, the multidimensional discrete big data is preprocessed: principal component analysis is used to reduce the dimensionality of the data, and the concept of entropy is introduced to separate key attributes from noncritical attributes so that the key attributes can be extracted. The dynamic grid is then partitioned according to the preprocessing results. Based on the partition, the OptiGrid clustering algorithm is used to cluster the data. The experimental results show that the clustering purity of this algorithm is between 95% and 100%, which is significantly higher than that of the traditional algorithms. The proposed algorithm therefore has better comprehensive performance: its clustering shape is closer to the original data distribution, its clustering purity is higher, and it executes faster.
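The entropy-based attribute split described above can be illustrated with a minimal sketch. It assumes discrete attribute values, Shannon entropy in bits, and a simple threshold rule in which attributes at or above the threshold are treated as key. The abstract does not state the actual criterion, so the function names (`shannon_entropy`, `split_attributes`), the `threshold` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of a discrete attribute's value distribution."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def split_attributes(rows, threshold):
    """Split column indices into key and noncritical attributes.

    Assumption (not from the source): a column whose entropy is at or
    above `threshold` is treated as a key attribute.
    """
    columns = list(zip(*rows))  # transpose rows into per-attribute columns
    entropies = [shannon_entropy(col) for col in columns]
    key = [i for i, h in enumerate(entropies) if h >= threshold]
    noncritical = [i for i, h in enumerate(entropies) if h < threshold]
    return key, noncritical, entropies


# Toy data: attribute 1 is constant, so it carries no information.
rows = [(0, 1, 5), (1, 1, 5), (0, 1, 6), (1, 1, 7)]
key, noncritical, entropies = split_attributes(rows, threshold=0.5)
print(key, noncritical)
```

In this toy example the constant attribute falls below the threshold and is set aside, while the two varying attributes are kept. A real pipeline would choose the threshold from the data and might invert the rule, depending on how the paper defines a key attribute.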
Highlights
Because of the shortcomings of the above methods, a multidimensional discrete big data clustering algorithm based on a dynamic grid is proposed
The results show that the proposed algorithm is effective in addressing these shortcomings, so it has higher comprehensive performance
To verify the effectiveness of the multidimensional discrete big data clustering algorithm based on a dynamic grid, the clustering shape, efficiency, and accuracy of the proposed algorithm were compared experimentally with the data clustering methods in Reference [2], Reference [3], and Reference [4], and the results are analyzed
Summary
With the rapid development of information technology, the Internet, and cloud computing, the amount of information is increasing explosively. Reference [2] proposed a data clustering method based on the K-means algorithm, which extracts a large number of data samples from massive data. Reference [3] proposed a data clustering method based on rapid regional evolution, which is able to reduce the dimensionality of the data. Because of the shortcomings of the above methods, a multidimensional discrete big data clustering algorithm based on a dynamic grid is proposed. This algorithm uses the data points to divide the grid in the neighborhood of each dimension, and dynamically adjusts the grid structure. The results show that the proposed algorithm is effective in addressing these shortcomings, so it has higher comprehensive performance.
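As a rough illustration of grid-based partitioning, the sketch below maps each point to a grid cell along every dimension and keeps only the cells that hold enough points. It is a simplification under stated assumptions: the cell width is fixed and uniform, and the dynamic adjustment of the grid structure is not reproduced because the summary does not describe it. The names `assign_to_grid`, `dense_cells`, `cell_width`, and `min_points` are hypothetical.

```python
import math
from collections import defaultdict


def assign_to_grid(points, cell_width):
    """Map each point index to the grid cell containing it.

    Assumption: a fixed, uniform cell width in every dimension.
    """
    cells = defaultdict(list)
    for idx, point in enumerate(points):
        cell = tuple(math.floor(x / cell_width) for x in point)
        cells[cell].append(idx)
    return cells


def dense_cells(cells, min_points):
    """Keep only the cells that contain at least `min_points` points."""
    return {cell: idxs for cell, idxs in cells.items() if len(idxs) >= min_points}


points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1), (5.1, 5.2), (5.3, 5.4), (9.9, 0.1)]
cells = assign_to_grid(points, cell_width=1.0)
dense = dense_cells(cells, min_points=2)
print(dense)
```

Here the isolated point at (9.9, 0.1) lands in a sparse cell and is excluded, while the two groups of nearby points form dense cells. A dynamic variant would revise the cell boundaries as the point distribution becomes known rather than fixing the width up front.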