Abstract

The technical requirements for behavior anomaly detection keep rising. By combining Internet of Things (IoT) technology with a variety of big data analysis algorithms, behavior anomalies can be detected accurately, largely by classifying behavior data sets. In this paper, the PLA-PRF (parallel random forest) algorithm is used to build an IoT behavior anomaly detection model that integrates big data analysis. In behavior detection, the PRF and DRF algorithms are compared for different numbers of decision trees. The results show that, compared with the DRF algorithm, the PLA-PRF, Spark MLRF (Spark Machine Learning Random Forests) and PRF algorithms perform better on the four datasets, with kappa values increased by about 3.13%, 2.56% and 1.98%, respectively. Among them, PLA-PRF has the higher accuracy when the sample size is small. As the sample size increases, the accuracy of behavior anomaly detection gradually decreases: some high-pheromone features are discarded while the algorithm constructs its feature subspaces, so the new feature space carries insufficient information and decision tree training does not learn the inherent patterns of the discarded data. Compared with Spark MLRF and DRF, PLA-PRF executes faster on large data sets, and the advantage becomes more prominent as the data volume grows. This is because PLA-PRF uses a data reuse strategy ("DRS") during parallelization, which reduces data communication overhead in a distributed environment and improves the parallel efficiency of the algorithm.
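
The comparison above is reported in terms of kappa values. As a minimal sketch, assuming the statistic meant is Cohen's kappa (the abstract does not specify the variant), it can be computed from true and predicted labels as shown below. The function name `cohen_kappa` and the sample labels are hypothetical illustrations, not the paper's code or data.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance.

    Assumption: this is the kappa variant meant in the text above.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(y_true)

    # Observed agreement: fraction of samples where both labelings match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Expected chance agreement, from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if expected == 1.0:
        # Degenerate case: both labelings use one and the same class.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: 1 = anomalous behavior, 0 = normal behavior.
y_true = [0, 0, 1, 1]
y_pred = [0, 0, 1, 0]
kappa = cohen_kappa(y_true, y_pred)
```

A value of 1 means perfect agreement with the ground truth, and 0 means agreement no better than chance, which is why kappa is often preferred over raw accuracy when anomalous samples are rare.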
