Abstract

Anomaly detection on high-dimensional, unbalanced data is a common problem. Effective anomaly detection is essential for early warning of faults or disasters and for maintaining system reliability, and it is a significant research issue in the analysis of sensor data. Anomaly detection of this kind is essentially a binary classification of unbalanced sequences. Such data are large in scale, computationally expensive to process, unevenly distributed across classes, and sequentially dependent. This paper uses long short-term memory networks (LSTMs) to exploit historical sequence data, integrates the synthetic minority oversampling technique (SMOTE) and K-nearest neighbors (kNN), and designs and constructs a kNN-SMOTE-LSTM anomaly detection network model suited to the unbalanced nature of the data. A kNN discriminant classifier continuously filters the generated samples so that only reliable ones are kept, which improves model performance and avoids the blindness and limitations of the SMOTE algorithm when generating new samples. The experiments show that the kNN-SMOTE-LSTM model can significantly improve the performance of unbalanced sequence binary classification.

Highlights

  • With the continuous growth of urban population and wealth accumulation, urban security has become increasingly prominent and faces a growing number of challenges

  • Because the synthetic minority oversampling technique (SMOTE) algorithm can produce noisy samples that distort the classification boundary, we adopt a discriminant classifier based on the k-nearest neighbor (kNN) algorithm and a basic classifier based on long short-term memory networks (LSTMs) to screen out the valid samples and remove the noisy ones, which improves the performance and accuracy of classification

  • Because the distribution of wireless sensor data changes over time and new abnormal situations may appear at any time, we adopt an LSTM-based anomaly detection network model to cope with this kind of time-domain sequence data. The unbalanced distribution of wireless sensor data, in which abnormal data are only a small portion of all daily monitoring data, motivates applying the SMOTE algorithm to augment the data and address the overfitting caused by the imbalance
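The screening idea in the highlights can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it interpolates between a minority sample and one of its minority neighbours (the standard SMOTE step) and keeps a candidate only if a kNN vote on the original data also labels it as minority. The function names, the toy 2-D data, and the parameter values (`k=3`, `n_new=20`) are assumptions made for the example, and the LSTM basic classifier is left out.

```python
import math
import random


def nearest(point, pool, k):
    """Return the k points in `pool` closest to `point` (Euclidean distance)."""
    return sorted(pool, key=lambda q: math.dist(point, q))[:k]


def knn_predict(point, samples, labels, k):
    """Majority vote over the k labelled samples closest to `point`."""
    order = sorted(range(len(samples)), key=lambda i: math.dist(point, samples[i]))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


def knn_filtered_smote(samples, labels, minority_label=1, n_new=20, k=3,
                       seed=0, max_tries=1000):
    """Generate up to n_new synthetic minority samples, keeping only those
    that a kNN vote on the original data also assigns to the minority class."""
    rng = random.Random(seed)
    pool = [s for s, y in zip(samples, labels) if y == minority_label]
    accepted = []
    for _ in range(max_tries):
        if len(accepted) >= n_new:
            break
        base = rng.choice(pool)
        # Drop the first entry of the sorted list: it is `base` itself.
        partner = rng.choice(nearest(base, pool, k + 1)[1:])
        gap = rng.random()
        # SMOTE step: a random point on the segment from base to partner.
        candidate = tuple(b + gap * (p - b) for b, p in zip(base, partner))
        # Discriminant step: reject candidates that land in majority territory.
        if knn_predict(candidate, samples, labels, k) == minority_label:
            accepted.append(candidate)
    return accepted


# Toy data (assumed): a 5x5 majority grid and a small minority cluster,
# plus one minority outlier sitting inside the majority region.
majority_pts = [(0.5 * x, 0.5 * y) for x in range(5) for y in range(5)]
minority_pts = [(5.0, 5.0), (5.5, 5.0), (5.0, 5.5), (5.5, 5.5), (1.2, 1.3)]
samples = majority_pts + minority_pts
labels = [0] * len(majority_pts) + [1] * len(minority_pts)

synthetic = knn_filtered_smote(samples, labels, minority_label=1, n_new=20, k=3)
print(len(synthetic), "synthetic minority samples kept")
```

In this toy setup, candidates interpolated from the outlier tend to fall among majority points and are discarded, which is the kind of noise sample plain SMOTE would have kept.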


Summary

Introduction

With the continuous growth of urban population and wealth accumulation, urban security has become increasingly prominent and faces a growing number of challenges. A wireless sensor network (WSN) is a distributed network architecture consisting of a set of autonomously networked electronic devices (sensor nodes) that collect data from the surrounding environment. The collected data include current, voltage, power, temperature, humidity, light, and noise. It is difficult to tell an anomaly in the sensor system from a real anomaly in the sensed environment. In this case, the type of wireless sensor network, the detection method, and the type of anomaly of interest can significantly affect the solution design. This study builds a WSN anomaly detection system around an LSTM-based basic classifier and a structured integration of the model components, and its innovative points include the following. (1) Considering that the distribution of wireless sensor data changes with time, this study adopts an LSTM-based anomaly detection network model to classify the data, which can effectively process time-domain sequence data. (4) The experiment shows that the misclassification defects of the traditional method can be addressed by a model built from the LSTM basic classifier, the data generator, and the discriminant classifier, combined in an iterative, organically fused structure.
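To make the "time-domain sequence data" framing concrete, the sketch below shows one common way to turn a stream of sensor readings into fixed-length sequences that a sequence classifier such as an LSTM could consume. It is an illustration under stated assumptions, not the paper's preprocessing: the window length, the rule of labelling each window by its final time step, the helper name `make_windows`, and the temperature trace are all hypothetical.

```python
def make_windows(readings, flags, window=4):
    """Slice a series into overlapping windows; each window takes the
    anomaly flag of its final time step (an assumed labelling rule)."""
    sequences, targets = [], []
    for end in range(window, len(readings) + 1):
        sequences.append(readings[end - window:end])
        targets.append(flags[end - 1])
    return sequences, targets


# Hypothetical temperature trace with a single spike at index 7.
readings = [20.1, 20.3, 20.2, 20.4, 20.3, 20.5, 20.4, 35.0, 20.6, 20.5]
flags = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]

sequences, targets = make_windows(readings, flags, window=4)

# Share of windows labelled anomalous: the class imbalance the text describes.
anomaly_share = sum(targets) / len(targets)
print(len(sequences), "windows, anomaly share:", round(anomaly_share, 3))
```

Even in this tiny trace only one window in seven is anomalous, which illustrates why the minority class needs augmenting before training.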

Analysis of High-Dimensional and Unbalanced Data Anomaly Detection
Experimental Analysis and Evaluation
Findings
Conclusion and Future Works
