Discord Discovery Research Articles

Recognizing anomalous events is a challenging but critical task in many scientific and industrial fields, especially when the properties of the anomalies are unknown. In this paper, we introduce a new anomaly concept, the “unicorn” or unique event, and present a new model-free, unsupervised algorithm for detecting unicorns. The key component of the algorithm is the Temporal Outlier Factor (TOF), which measures the uniqueness of events in continuous data sets from dynamic systems. Unique events differ from traditional outliers in important ways: a repetitive outlier is no longer a unique event, and a unique event is not necessarily an outlier, since it need not fall outside the distribution of normal activity. We examined the algorithm's performance in recognizing unique events on several types of simulated data sets with anomalies and compared it with the Local Outlier Factor (LOF) and discord discovery algorithms. TOF outperformed both LOF and discord discovery even in recognizing traditional outliers, and it also detected unique events that those algorithms missed. The benefits of the unicorn concept and the new detection method are illustrated on example data sets from very different scientific fields. Our algorithm successfully retrieved unique events in cases where they were already known, such as the gravitational waves of a binary black hole merger in LIGO detector data and the signs of respiratory failure in ECG data series. Furthermore, unique events were found in the LIBOR data set of the last 30 years.

A discord is a refinement of the concept of an anomalous subsequence of a time series, that is, a subsequence substantially dissimilar from the rest. The discord discovery problem arises in a wide range of application areas involving time series: medicine, economics, climate modeling, and others. In this paper we propose a new parallel discord discovery algorithm for many-core accelerators for the case when the input data fit in main memory. The algorithm exploits the fact that the Euclidean distances between subsequences of the time series can be computed independently. It consists of two stages: precomputation and discovery. At the precomputation stage, auxiliary matrix data structures are built to enable parallelization and efficient vectorization of the computations on the accelerator. At the discovery stage, the algorithm searches for the discord using these structures. The algorithm is implemented for Intel MIC (Many Integrated Core) and NVIDIA GPU accelerators, parallelized with OpenMP and OpenACC, respectively. Computational experiments confirm the high scalability of the proposed algorithm.
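The discord notion used in this abstract can be illustrated with a minimal serial sketch. It is not the paper's parallel algorithm: it assumes the common definition of a discord as the subsequence whose nearest non-overlapping neighbor is farthest away in Euclidean distance, and the function name `find_discord` and the window-length parameter `m` are hypothetical names chosen for illustration. Each row of the distance computation is independent of the others, which is the property the abstract says the parallel version exploits.

```python
import numpy as np


def find_discord(series, m):
    """Brute-force discord search (illustrative sketch, O(n^2 * m) time).

    Returns (start_index, distance) of the length-m subsequence whose
    nearest non-overlapping neighbor is farthest away.
    """
    x = np.asarray(series, dtype=float)
    n = len(x) - m + 1  # number of subsequences
    if m < 1 or n < m + 1:
        raise ValueError("series must contain at least 2*m points")

    # Matrix whose rows are all length-m subsequences (a read-only view).
    subs = np.lib.stride_tricks.sliding_window_view(x, m)
    starts = np.arange(n)

    best_idx, best_dist = -1, -np.inf
    for i in range(n):
        # Euclidean distance from subsequence i to every subsequence.
        d = np.linalg.norm(subs - subs[i], axis=1)
        # Exclude trivial matches: subsequences overlapping subsequence i.
        d[np.abs(starts - i) < m] = np.inf
        nearest = d.min()
        # Skip subsequences that have no non-overlapping neighbor at all.
        if np.isfinite(nearest) and nearest > best_dist:
            best_idx, best_dist = i, nearest
    return best_idx, best_dist
```

As a usage example, calling `find_discord(x, 20)` on a sine wave with a short bump added to a few samples should return a start index whose window overlaps the bump. The sketch compares raw subsequences; whether to z-normalize each subsequence first is a separate design choice that the abstract does not specify.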

Articles published on Discord Discovery

A Parallel Discord Discovery Algorithm for a Graphics Processor

A fast algorithm for complex discord searches in time series: HOT SAX Time

Model-free detection of unique events in time series

A GPU Acceleration Framework for Motif and Discord Based Pattern Mining

Matrix profile goes MAD: variable-length motif and discord discovery in data series

A parallel algorithm for time series discord discovery on many-core accelerators (in Russian)

A Multi-resolution Approximation for Time Series

ESPSA: A prediction-based algorithm for streaming time series segmentation

Finding time series discord based on bit representation clustering

Disk aware discord discovery: finding unusual time series in terabyte sized datasets
