Abstract

To advance data cleaning and data mining techniques and to avoid the weakness of traditional Bayesian networks in expressing uncertain knowledge, a Bayesian network algorithm based on combined data cleaning and mining technology is proposed, and a manual functional data cleaning architecture based on Hadoop is constructed. The results show that, for the same amount of data, the traditional nearest neighbor sorting algorithm with a window size of 5 takes the least time, and the one with a window size of 7 always takes the longest. The nonfixed window nearest neighbor sorting algorithm initially takes about as long as the traditional algorithm with a window size of 5, but as the data volume grows its time consumption increases rapidly until it approaches that of the traditional algorithm with a window size of 7. The algorithm therefore improves the precision of data cleaning at the expense of cleaning speed. It is concluded that the artificial intelligence architecture based on combined data significantly improves the efficiency of the algorithm and can effectively analyze and process large data sets.
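
The abstract does not spell out the nearest neighbor sorting algorithm, so the following is only a minimal sketch of one common reading: a fixed-window sorted-neighborhood pass for duplicate detection. The function names, the `difflib`-based similarity measure, the 0.85 threshold, and the sample records are all assumptions made for illustration, not details taken from the paper. Each record is compared with at most `window - 1` successors in sorted order, so a window of 7 performs more comparisons than a window of 5, which is consistent with the timing difference reported above.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for the paper's measure."""
    return SequenceMatcher(None, a, b).ratio()


def sorted_neighborhood(records, key, window=5, threshold=0.85):
    """Return index pairs (i, j), i < j, of records flagged as likely duplicates.

    Hypothetical helper: sorts records by `key`, then compares each record
    only with the next (window - 1) records in that sorted order.
    """
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            if similarity(records[i], records[j]) >= threshold:
                pairs.add((min(i, j), max(i, j)))
    return pairs


# Illustrative records only; not data from the paper.
records = ["john smith", "jon smith", "mary jones", "john smyth", "marie jones"]
print(sorted(sorted_neighborhood(records, key=lambda r: r, window=3)))
```

A nonfixed window variant would, under the same assumptions, widen or narrow `window` depending on how similar neighboring records are, which trades extra comparisons for better recall.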
