Abstract

Many IoT (Internet of Things) applications, such as the industrial internet and the smart city, continuously collect data from a massive number of sensors. It is crucial to exploit and analyze this time series data efficiently. Subsequence matching is a fundamental task in mining time series data. Most existing works develop the index and the matching approach for static time series data. However, IoT applications need to continuously collect new data and store huge volumes of historical time series data, which poses a significant challenge for static indexing approaches. To address this challenge, we propose a lightweight index structure, L-index, and a matching approach, L-match, for the constraint normalized subsequence matching (cNSM) problem. L-index is a two-layer structure built on a simple series synopsis: the mean values of disjoint windows. It is easy to build and to update as data grows. Moreover, to further improve efficiency for variable query lengths, we propose an optimization technique named SD-pruning. We conduct extensive experiments, and the results verify the effectiveness and efficiency of the proposed approach.

Highlights

  • Recent advances in sensing, networking and storage technologies have led to tremendous growth in IoT (Internet of Things) applications, such as the smart grid, industrial internet and wearable devices, which generate and collect large amounts of time series data from a wide variety of domains. Once the time series data have been collected, data scientists begin to exploit and analyze them [1], [2]

  • Given a long time series T, a query series Q and a distance threshold ε, the subsequence matching problem finds all subsequences of T whose distance to Q falls within the threshold ε

  • To solve the constraint normalized subsequence matching (cNSM) problem in IoT applications, we propose a two-layer lightweight index, L-index, which is easy to build and update


Summary

Introduction

Recent advances in sensing, networking and storage technologies have led to tremendous growth in IoT (Internet of Things) applications, such as the smart grid, industrial internet and wearable devices, which generate and collect large amounts of time series data from a wide variety of domains. Once the time series data have been collected, data scientists begin to exploit and analyze them [1], [2]. Given a long time series T, a query series Q and a distance threshold ε, the subsequence matching problem finds all subsequences of T whose distance to Q falls within the threshold ε. Many approaches have been proposed to improve the efficiency [4], [5] or to deal with various distance functions [6], [7], such as Euclidean distance and Dynamic Time Warping. All these approaches consider only the raw subsequence matching problem (RSM for short).
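
To make the problem definition above concrete, the following is a minimal brute-force sketch of raw subsequence matching under Euclidean distance. It is an illustration only, not the indexed approach the paper proposes: the function names `euclidean` and `raw_subsequence_matches` are hypothetical, and treating "within the threshold" as distance ≤ ε is an assumption.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def raw_subsequence_matches(T, Q, eps):
    """Return (offset, distance) for every length-|Q| subsequence of T
    whose Euclidean distance to Q is at most eps (assumed inclusive).

    Brute force: every offset is checked, with no index and no pruning.
    """
    m = len(Q)
    matches = []
    # Slide a window of length m over T; the range is empty if T is shorter than Q.
    for i in range(len(T) - m + 1):
        d = euclidean(T[i:i + m], Q)
        if d <= eps:
            matches.append((i, d))
    return matches


# Illustrative data (made up for this sketch, not taken from the paper).
T = [0.0, 1.0, 2.0, 1.0, 0.0, 1.0, 2.1, 1.0]
Q = [1.0, 2.0, 1.0]
for offset, dist in raw_subsequence_matches(T, Q, eps=0.2):
    print(offset, round(dist, 3))
```

This scan costs on the order of |T| · |Q| operations per query, which suggests why an index that can be built and updated cheaply matters when T keeps growing.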

