Contextual anomaly detection framework for big sensor data

Michael A Hayes, Miriam AM Capretz

doi:10.1186/s40537-014-0011-y

Abstract

Detecting and processing anomalies in Big Data in real time is a difficult task. The volume and velocity of the data within many systems make it difficult for typical algorithms to scale and retain their real-time characteristics. The pervasiveness of data, combined with the fact that many existing algorithms consider only the content of the data source (e.g. a sensor reading itself, without concern for its context), leaves room for improvement. The proposed work defines a contextual anomaly detection framework composed of two distinct steps: content detection and context detection. The content detector is used to determine anomalies in real time, though it is likely to identify false positives as well. The context detector is used to prune the output of the content detector, retaining only those anomalies that are both content and contextually anomalous. The context detector utilizes the concept of profiles, which are groups of similar data points generated by a multivariate clustering algorithm. The research has been evaluated against two real-world sensor datasets provided by a local company in Brampton, Canada. Additionally, the framework has been evaluated against the open-source Dodgers dataset, available at the UCI machine learning repository, and against the R statistical toolbox.
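The two-step flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 3-sigma Gaussian test stands in for whatever models the paper uses, the profiles are assumed to have been built beforehand (the clustering step is omitted), and the names `fit_gaussian`, `is_outlier`, `detect`, and the sample data are all hypothetical.

```python
from statistics import mean, stdev

def fit_gaussian(values):
    """Return (mean, standard deviation) of a list of readings."""
    return mean(values), stdev(values)

def is_outlier(value, mu, sigma, k=3.0):
    """Flag a value more than k standard deviations from the mean."""
    return sigma > 0 and abs(value - mu) > k * sigma

def detect(readings, history, profiles, k=3.0):
    """
    readings: list of (profile_id, value) pairs to score.
    history:  past values of this sensor, used by the content detector.
    profiles: dict profile_id -> past values of all points in that profile
              (assumed precomputed; clustering is not shown here).
    Returns the readings flagged by both detectors.
    """
    # Step 1: content detection, a cheap check on the value alone.
    mu, sigma = fit_gaussian(history)
    content_hits = [r for r in readings if is_outlier(r[1], mu, sigma, k)]

    # Step 2: context detection, run only on the content hits to prune them.
    confirmed = []
    for profile_id, value in content_hits:
        p_mu, p_sigma = fit_gaussian(profiles[profile_id])
        if is_outlier(value, p_mu, p_sigma, k):
            confirmed.append((profile_id, value))
    return confirmed

# Hypothetical data: one sensor's history and two precomputed profiles.
history = [20, 21, 19, 20, 22, 21, 20, 19]
profiles = {
    "peak": [30, 34, 36, 33, 35, 32],
    "night": [19, 20, 21, 20, 19, 21],
}
readings = [("night", 20.5), ("peak", 35.0), ("night", 35.0)]
result = detect(readings, history, profiles)
```

In this sample, both readings of 35.0 pass the content detector, but only the one in the "night" profile survives the context check; the "peak" reading is pruned as a false positive because 35.0 is typical for that profile.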

Highlights

Anomalies are abnormal events or patterns that do not conform to expected events or patterns [1].
In running only the context detector over the entire test dataset, the results showed that there were no context anomalies that were not passed to the content detector.
The work presented in the paper describes a novel framework for anomaly detection in Big Data.

Summary

Introduction

Anomalies are abnormal events or patterns that do not conform to expected events or patterns [1]. Anomalies are generally categorized into three types: point (or content) anomalies, context anomalies, and collective anomalies. Point anomalies are data points that are considered abnormal when viewed against the whole dataset. Context anomalies are data points that are considered abnormal when viewed against meta-information associated with the data points. Collective anomalies are data points that are considered anomalous when viewed together with other data points, against the rest of the dataset. Detection algorithms can correspondingly be categorized as point detection, collective detection, or context-aware detection algorithms [1]. Contextual anomalies exist where the dataset includes a combination of behavioural and contextual attributes. Song et al. [9] refer to contextual and behavioural attributes as environmental and indicator attributes, respectively.
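The difference between a point anomaly and a context anomaly can be made concrete with a small sketch. It assumes a single behavioural attribute (a temperature value), a single contextual attribute (a month label), and a simple 3-sigma z-score test; the function names and the data are hypothetical, and collective anomalies are not covered.

```python
from statistics import mean, stdev

def z_score(value, sample):
    """Distance of value from the sample mean, in standard deviations."""
    return abs(value - mean(sample)) / stdev(sample)

def classify(record, dataset, k=3.0):
    """
    record:  (context, value), e.g. ("jan", 28).
    dataset: list of (context, value) pairs seen so far.
    Returns "point", "contextual", or "normal".
    """
    context, value = record
    all_values = [v for _, v in dataset]
    same_context = [v for c, v in dataset if c == context]

    # Point anomaly: abnormal against the whole dataset.
    if z_score(value, all_values) > k:
        return "point"
    # Context anomaly: unremarkable overall, abnormal given its context.
    if z_score(value, same_context) > k:
        return "contextual"
    return "normal"

# Hypothetical temperature readings tagged with the month they were taken in.
dataset = [
    ("jan", 2), ("jan", 3), ("jan", 1), ("jan", 2), ("jan", 4),
    ("jul", 27), ("jul", 29), ("jul", 28), ("jul", 30), ("jul", 26),
]

labels = [
    classify(("jul", 90), dataset),  # far outside every reading
    classify(("jan", 28), dataset),  # ordinary overall, abnormal for January
    classify(("jul", 28), dataset),  # ordinary for July
]
```

A reading of 28 is well within the overall range of the data, so a point detector alone would miss it; it is flagged only once the contextual attribute restricts the comparison to January readings.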



Journal: Journal of Big Data
Publication Date: Feb 27, 2015
Citations: 146
License type: CC BY 4.0


