Abstract

The Hadoop platform forms a complete ecosystem for large-scale distributed processing, including HDFS, MapReduce, HBase and other subsystems. This paper analyzes the parallel processing capabilities of the Hadoop platform and applies them to data mining algorithms. To obtain better algorithm efficiency, a K-Modes clustering algorithm based on a big data platform is proposed. It represents each cluster by its mode in place of the central node, and the mining process uses naive Bayes to improve mining efficiency. The experimental results show that the proposed algorithm has better adaptability, saves time and improves efficiency.
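
For readers unfamiliar with K-Modes, the core idea of representing a cluster by its mode can be sketched as follows. This is a minimal single-machine illustration of the standard K-Modes loop, not the paper's Hadoop implementation; it omits the MapReduce parallelization and the naive Bayes step, and the function names (`matching_distance`, `cluster_mode`, `k_modes`) are hypothetical.

```python
from collections import Counter


def matching_distance(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Take the most frequent value of each attribute as the cluster mode."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records, initial_modes, max_iter=10):
    """Alternate assignment and mode update until the modes stop changing.

    Assumes max_iter >= 1 and that initial_modes are tuples of the same
    length as the records.
    """
    modes = list(initial_modes)
    for _ in range(max_iter):
        # Assignment step: each record joins the cluster with the nearest mode.
        clusters = [[] for _ in modes]
        for rec in records:
            nearest = min(range(len(modes)),
                          key=lambda i: matching_distance(rec, modes[i]))
            clusters[nearest].append(rec)
        # Update step: recompute each mode; keep the old mode for empty clusters.
        new_modes = [cluster_mode(c) if c else m
                     for c, m in zip(clusters, modes)]
        if new_modes == modes:
            break
        modes = new_modes
    return modes, clusters
```

In a MapReduce setting, the assignment step would plausibly run in the mappers and the mode update in the reducers, but that mapping is an assumption and is not stated in the abstract.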

