Abstract

Data processed by distributed cloud computing in a cloud storage environment is massive in volume and prone to theft or loss. To detect such abnormal data behavior, this paper proposes a MapReduce-based abnormal data mining and detection algorithm built on the Hadoop Distributed File System (HDFS) and a deep neural network. First, the algorithm analyzes the characteristics of the MAC timestamps generated when HDFS folders are replicated and establishes methods for detecting and measuring replication behavior, so that all patterns leading to data anomalies, including theft, packet loss, and malicious attack, can be detected. Second, the algorithm uses a deep neural network to design a task partition strategy suitable for arbitrary MapReduce data, and constructs an input dataset that records the HDFS hierarchical relationships. Finally, by designing a dataset and an algorithm execution scheme suited to MapReduce task partitioning, it exploits the parallel processing capability of MapReduce to analyze massive timestamp data efficiently. Experimental results show that the algorithm can control the missed detection rate and the number of falsely detected folders through its segmented detection strategy. Compared with existing big data anomaly detection methods, the algorithm achieves higher execution efficiency and good scalability.
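To make the timestamp-based detection idea concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes that replicating a folder reads every file in it, so that the files' access timestamps cluster in a short burst, and it simulates the map, shuffle, and reduce phases in plain Python rather than on Hadoop. The record layout, the function names, and the thresholds (`window`, `min_ratio`, `min_files`) are hypothetical values chosen for illustration; the deep neural network and the task partition strategy are not modeled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple, Set, Tuple


class FileRecord(NamedTuple):
    """Hypothetical record layout; not taken from the paper."""
    path: str     # full file path, e.g. "/data/a/part-0"
    atime: float  # last access time, in seconds since the epoch


def map_phase(records: Iterable[FileRecord]) -> Iterator[Tuple[str, float]]:
    """Map: key each file's access time by its parent folder."""
    for rec in records:
        folder = rec.path.rsplit("/", 1)[0] or "/"
        yield folder, rec.atime


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Shuffle: group access times by folder, as the framework would."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for folder, atime in pairs:
        groups[folder].append(atime)
    return groups


def reduce_phase(atimes: List[float], window: float) -> float:
    """Reduce: largest fraction of a folder's files accessed within
    any single span of `window` seconds (sliding window over sorted times)."""
    ordered = sorted(atimes)
    best, lo = 0, 0
    for hi in range(len(ordered)):
        while ordered[hi] - ordered[lo] > window:
            lo += 1
        best = max(best, hi - lo + 1)
    return best / len(ordered)


def detect_copied_folders(
    records: Iterable[FileRecord],
    window: float = 60.0,    # assumed burst length, in seconds
    min_ratio: float = 0.9,  # assumed share of files that must fall in the burst
    min_files: int = 3,      # ignore folders too small to judge
) -> Set[str]:
    """Return folders whose access times look like one bulk read."""
    groups = shuffle(map_phase(records))
    return {
        folder
        for folder, atimes in groups.items()
        if len(atimes) >= min_files and reduce_phase(atimes, window) >= min_ratio
    }
```

Calling `detect_copied_folders` on a list of `FileRecord` values returns the set of folder paths whose files were almost all accessed within one short window. Keying by parent folder keeps each reduce call independent, which is the property that allows this kind of analysis to be spread across MapReduce tasks.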
