The explosion of online streaming services in the Internet-of-Things (IoT) ecosystem poses new difficulties in detecting anomalies in real-time, continuous data. IoT data anomalies are classified as long-term or short-term, each with distinct characteristics, which makes it difficult to develop a single detection mechanism for both. Existing techniques require excessive training data and suffer from high variability. This paper overcomes these challenges by proposing a Variance profile Exploitation for Anomaly Detection (VEAD) scheme using the discrete wavelet transform and k-means clustering. VEAD is initialized by a fast training phase with a single data segment, from which a sensor variance profile is created. This variance profile reflects the degree of deviation in the data collected from different sensors over a specific time period and is continuously updated by integrating new data segments for effective anomaly detection. Overlapping data collection in the detection phase reveals correlations among consecutive data segments, leading to improved detection accuracy (ACC). Numerical experiments are performed on the Intel Berkeley Research Lab dataset with injected synthetic anomalies. A comparative performance evaluation against state-of-the-art methods confirms the effectiveness of VEAD in achieving a higher ACC and a lower false-positive rate (FPR). Notably, VEAD achieves ACCs of 95% and 97% in detecting long-term and short-term anomalies, respectively. The high specificity of VEAD is also shown by an FPR of at most 2% in all cases. The low computational complexity and fast anomaly detection make VEAD suitable for deployment in live systems with massive data streams.
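The ingredients named in the abstract (wavelet transform, variance profile, overlapping segments, k-means) can be illustrated with a minimal sketch. This is not the VEAD algorithm itself: the single-level Haar wavelet, the single-sensor input, the window size and step, the choice of k = 2, and all function names are assumptions made purely for illustration.

```python
from statistics import pvariance

def haar_detail(segment):
    """Single-level Haar DWT detail coefficients (assumed wavelet choice)."""
    scale = 2 ** 0.5
    return [(segment[i] - segment[i + 1]) / scale
            for i in range(0, len(segment) - 1, 2)]

def overlapping_segments(series, size, step):
    """Yield windows of `size` samples; step < size makes them overlap."""
    for start in range(0, len(series) - size + 1, step):
        yield series[start:start + size]

def variance_profile(series, size, step):
    """Variance of the detail coefficients for each overlapping segment."""
    return [pvariance(haar_detail(seg))
            for seg in overlapping_segments(series, size, step)]

def two_means_1d(values, iters=20):
    """1-D k-means with k = 2; label 1 marks the higher-variance cluster."""
    low_c, high_c = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to the nearer centroid (ties go to the low cluster).
        labels = [1 if abs(v - high_c) < abs(v - low_c) else 0 for v in values]
        low = [v for v, lab in zip(values, labels) if lab == 0]
        high = [v for v, lab in zip(values, labels) if lab == 1]
        if low:
            low_c = sum(low) / len(low)
        if high:
            high_c = sum(high) / len(high)
    return labels

# Hypothetical temperature trace: steady readings with a short injected burst.
series = [20.0, 20.1] * 8 + [20.0, 30.0] * 2 + [20.0, 20.1] * 6
profile = variance_profile(series, size=8, step=4)  # 50% overlap (assumed)
labels = two_means_1d(profile)
print(labels)
```

In this toy trace the burst falls inside two consecutive overlapping windows, so both are assigned to the high-variance cluster. That is one way to see why overlapping segments can help: an anomaly near a window boundary is still captured in full by a neighbouring window.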