Abstract

Anomaly detection has gained considerable attention in the past couple of years. Emerging technologies, such as the Internet of Things (IoT), are known to be among the most critical sources of data streams, producing massive amounts of data continuously from numerous applications. Examining the collected data to detect suspicious events can reduce functional threats and avoid unforeseen issues that cause downtime in the applications. Due to the dynamic nature of data stream characteristics, many unresolved problems persist. In the existing literature, methods have been designed and developed to evaluate certain anomalous behaviors in IoT data stream sources. However, there is a lack of comprehensive studies that discuss all the aspects of IoT data processing. This paper attempts to fill this gap by providing a complete picture of state-of-the-art techniques addressing the major problems and core challenges in IoT data. The nature of data, anomaly types, learning mode, window model, datasets, and evaluation criteria are also presented. Research challenges related to evolving data, evolving features, windowing, ensemble approaches, nature of input data, data complexity and noise, parameter selection, data visualization, heterogeneity of data, accuracy, and large-scale and high-dimensional data are investigated. Finally, the challenges that require substantial research efforts and future directions are summarized.

Highlights

  • The advent of the Internet has revolutionized communication between humans

  • Since a massive volume of data arrives in the form of data streams, which pose problems of their own, it is necessary to address the challenging issue of efficiently detecting anomalies within evolving data streams

  • Anomaly detection has attracted significant attention among researchers in recent years, due to the advancement of sensing technologies characterized by low cost and high impact in diverse application domains


Summary

Introduction

The advent of the Internet has revolutionized communication between humans. Internet of Things (IoT) devices are reshaping how humans perceive and interact with the physical world. IoT has become one of the biggest data sources in the past few years [4]. Methods such as machine learning algorithms can be used to extract meaningful information from these data. Ability to handle fast data: the anomaly detection algorithm must be able to handle and process data in real time as data points arrive continuously from the data source; because data streams can be huge, they should be handled in one pass. This paper focuses on machine learning techniques for anomaly detection in evolving data streams.
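
To make the one-pass requirement concrete, the following is a minimal sketch of a streaming point-anomaly detector. It is not a method taken from the paper: it assumes a univariate numeric stream and uses Welford's running mean and variance with a z-score threshold. The names `StreamingZScoreDetector` and `detect`, and the parameters `threshold` and `warmup`, are hypothetical and chosen only for illustration.

```python
import math


class StreamingZScoreDetector:
    """Illustrative one-pass detector (an assumption, not the paper's method).

    Keeps only a running count, mean, and sum of squared deviations
    (Welford's algorithm), so memory use is constant in stream length.
    """

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # z-score above which a point is flagged
        self.warmup = warmup        # points to observe before flagging anything
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Score x against the statistics seen so far, then absorb it."""
        is_anomaly = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford update: each point is visited exactly once.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)
        return is_anomaly


def detect(stream, threshold=3.0, warmup=10):
    """Return the indices of the points flagged while consuming the stream."""
    detector = StreamingZScoreDetector(threshold, warmup)
    return [i for i, x in enumerate(stream) if detector.update(x)]


# Hypothetical sensor readings with one injected spike.
readings = [10.0, 10.1, 9.9, 10.2, 9.8] * 4 + [50.0, 10.0, 10.1]
flagged = detect(readings)
```

In this sketch every point, including a flagged one, is absorbed into the running statistics, so a large spike inflates the variance for later points. A detector meant for evolving streams would more likely use a sliding or fading window so that old data is gradually forgotten.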

Background
Taxonomy
Machine Learning Techniques
Deep Learning Techniques
Nature of the Data
Anomaly Types
Point Anomaly
Contextual
Collective Anomaly
A Collective series of observations
Supervised Anomaly Detection
Semi-Supervised Anomaly Detection
Unsupervised Anomaly Detection
Window Models
Datasets
Evaluation Criteria
Results
Research Challenges and Future Directions
Conclusions
