Abstract

This paper optimizes and parallelizes the C4.5 algorithm on the Spark platform in a cloud-edge collaboration setting. First, the information entropy calculation used in the definition of boundary points and in Fayyad's theorem is replaced with the Gini index. This reduces the number of information entropy calculations needed to select split points when the traditional C4.5 algorithm discretizes continuous attributes, and it simplifies the calculation formula, thereby reducing the algorithm's execution time. On this basis, the CFS algorithm is introduced to optimize the calculation of the information gain ratio, which helps select better attributes for decision tree partitioning. The improved C4.5 algorithm is then parallelized on the Spark platform. As the application scenario for experimental verification, we choose an “intelligent door guard” based on cloud-edge collaboration in the context of epidemic prevention and control management: it performs a timely risk assessment of each person who wants to enter a place and responds with whether entry is allowed. Experiments show that, compared with the traditional C4.5 algorithm and previous improvements to it, the improved parallel C4.5 algorithm reduces running time and raises accuracy. In the cloud-edge collaboration application scenario, the improved algorithm decreases time delay and increases inspection efficiency.
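The abstract does not give the modified formula, so the following is only a minimal single-machine sketch of the general idea: restrict candidate split points for a continuous attribute to boundary points, and score each candidate with the weighted Gini index instead of information entropy. The function names (`gini`, `boundary_points`, `best_split`) and the midpoint convention for thresholds are assumptions made for illustration, not the paper's implementation, and the Spark parallelization and CFS step are not shown.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def boundary_points(values, labels):
    """Candidate thresholds between adjacent distinct attribute values.

    A midpoint is kept unless the samples at both neighbouring values all
    share one and the same class (assumed reading of the boundary-point
    definition; the midpoint as threshold is also an assumption).
    """
    classes_at = {}
    for v, y in zip(values, labels):
        classes_at.setdefault(v, set()).add(y)
    distinct = sorted(classes_at)
    cuts = []
    for lo, hi in zip(distinct, distinct[1:]):
        if len(classes_at[lo] | classes_at[hi]) > 1:
            cuts.append((lo + hi) / 2)
    return cuts


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best boundary point, or None."""
    n = len(values)
    best = None
    for cut in boundary_points(values, labels):
        left = [y for v, y in zip(values, labels) if v <= cut]
        right = [y for v, y in zip(values, labels) if v > cut]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if best is None or score < best[1]:
            best = (cut, score)
    return best


if __name__ == "__main__":
    # Hypothetical toy data: one continuous attribute, two classes.
    xs = [1, 2, 3, 4, 5, 6]
    ys = ["allow", "allow", "allow", "deny", "deny", "deny"]
    print(boundary_points(xs, ys))
    print(best_split(xs, ys))
```

In this toy data only one of the five possible midpoints is a boundary point, so only one candidate is scored; that reduction in candidate evaluations is the effect the abstract attributes to the boundary-point restriction.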
