Abstract
The automated detection of sequential anomalies in time series is an essential task for many applications, such as the monitoring of technical systems, fraud detection in high-frequency trading, or the early detection of disease symptoms. All these applications require the detection to find all sequential anomalies as fast as possible on potentially very large time series. In other words, the detection needs to be effective, efficient, and scalable w.r.t. the input size. Series2Graph (S2G) is an effective solution based on graph embeddings: it is robust against re-occurring anomalies, can discover sequential anomalies of arbitrary length, and works without training data. Yet, Series2Graph is not scalable due to its single-threaded approach; in particular, it cannot process arbitrarily large sequences due to the memory constraints of a single machine. In this paper, we propose our distributed anomaly detection system, DADS for short, which is an efficient and scalable adaptation of Series2Graph. Based on the actor programming model, DADS distributes the input time sequence, the intermediate state, and the computation to all processors of a cluster in a way that minimizes communication costs and synchronization barriers. Our evaluation shows that DADS is orders of magnitude faster than S2G, scales almost linearly with the number of processors in the cluster, and can process much larger input sequences due to its scale-out property.
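As a rough illustration of what distributing an input sequence for subsequence-based analysis can involve, the following minimal sketch splits a series of length n into contiguous index ranges that overlap by the subsequence length minus one, so that every subsequence lies entirely within exactly one range. This is a generic partitioning technique shown under stated assumptions, not the partitioning scheme of DADS; the function name `partition_with_overlap` and its parameters are hypothetical.

```python
from typing import List, Tuple


def partition_with_overlap(n: int, workers: int, subseq_len: int) -> List[Tuple[int, int]]:
    """Split the index range [0, n) into at most `workers` half-open ranges.

    Hypothetical helper for illustration only. Each range owns a block of
    subsequence start positions and is extended by `subseq_len - 1` points
    on the right, so every subsequence it owns is fully contained in it.
    """
    if workers < 1 or subseq_len < 1 or n < subseq_len:
        raise ValueError("need workers >= 1 and 1 <= subseq_len <= n")

    num_starts = n - subseq_len + 1          # number of subsequence start positions
    base, extra = divmod(num_starts, workers)

    ranges = []
    start = 0
    for w in range(workers):
        count = base + (1 if w < extra else 0)  # start positions owned by worker w
        if count == 0:
            continue                             # more workers than start positions
        end = start + count + subseq_len - 1     # exclusive end, including the overlap
        ranges.append((start, end))
        start += count
    return ranges


if __name__ == "__main__":
    # 10 points, 3 workers, subsequences of length 4
    print(partition_with_overlap(10, 3, 4))
```

The overlap lets each worker extract its subsequences without fetching points from a neighbor. The price is that `subseq_len - 1` points are duplicated at each range boundary.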
Highlights
Time series analysis is a multi-disciplinary field with use cases in astrophysics [12], neurosciences [37], environmental monitoring [35], Internet of Things [14], finance [46], asset tracking [29], aviation engineering [7], and many further disciplines.
An effective sequential anomaly detection algorithm automatically marks all anomalous, i.e., infrequent, subsequences in a time series; in doing so, it is robust against reoccurring anomalies, can discover sequential anomalies of arbitrary length, and works without training data.
We introduce our Distributed Anomaly Detection System (DADS).
Summary
Time series analysis is a multi-disciplinary field with use cases in astrophysics [12], neurosciences [37], environmental monitoring [35], Internet of Things [14], finance [46], asset tracking [29], aviation engineering [7], and many further disciplines. Typical time series of terabyte to petabyte size, such as those that often appear in domains like bioinformatics [20] or astrophysics [51], can hardly be processed with the current version of S2G. For this reason, we introduce our Distributed Anomaly Detection System (DADS). Point anomaly detection is an important tool for the analysis of sensor networks. For this special purpose, several time series mining algorithms have been introduced [35]. To identify distance- or density-based outliers, algorithms such as [44] propose to estimate the data distribution of the underlying time series across the network of sensors and to calculate local outliers with respect to these estimates. An extensive comparison of different algorithms can be found in [45].