Abstract

With the rapid development of computer technology, the volume of data generated in scientific research, industry, and commerce is growing at an alarming rate. Traditional data mining techniques are limited to mining a single data source, so how to mine distributed data sources and how to mine them in parallel have become hot topics in the field of data mining. The purpose of this paper is to study distributed data mining in a grid computing environment. It reviews existing grid technology and data mining technology and discusses the feasibility of combining the two. On this basis, a grid-based distributed data mining service framework is proposed, and its detailed design is presented. The framework is then tested, and the experimental results show that applying the grid framework to distributed mining can improve computing performance and increase the size of the data that can be processed. The computational speedup of the framework is measured on 1 to 8 nodes, and the speedup ratios are 1, 2, 3, 4, 5, 6, 7, and 8, respectively. The speedup is therefore directly proportional to the number of nodes used in the calculation.
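
To make the reported figures concrete, the following minimal sketch shows how speedup and parallel efficiency are conventionally defined and applies the efficiency formula to the ratios stated above. The function names `speedup` and `parallel_efficiency` are illustrative and do not come from the framework itself, and no timing data beyond the reported ratios is assumed.

```python
def speedup(t_single: float, t_parallel: float) -> float:
    """Conventional speedup: single-node run time divided by n-node run time."""
    return t_single / t_parallel


def parallel_efficiency(speedups: dict[int, float]) -> dict[int, float]:
    """Efficiency per node count: speedup divided by the number of nodes."""
    return {nodes: s / nodes for nodes, s in speedups.items()}


# Speedup ratios as reported in the abstract for 1 to 8 nodes.
reported_speedups = {nodes: float(nodes) for nodes in range(1, 9)}

efficiency = parallel_efficiency(reported_speedups)

for nodes in sorted(efficiency):
    print(f"{nodes} node(s): speedup {reported_speedups[nodes]:.0f}, "
          f"efficiency {efficiency[nodes]:.2f}")
```

Under these definitions, a speedup equal to the node count corresponds to an efficiency of 1.0 at every node count, which is what linear scaling means.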
