Abstract

In many application domains, real-time processing of sensor data requires imputing values that are missing because of technical failures or human error. This article proposes a parallel algorithm for imputing missing values of a streaming time series in real time on a many-core processor. The algorithm employs a set of reference time series that are semantically related to the original time series. It relies on the following heuristic: if the reference time series contain repeated (similar) subsequences, then the time series containing the missing value has repeated subsequences in the same time intervals. For each reference time series, a query is defined as the subsequence of a given length that ends at the moment when the value in the original time series is missing. The similarity of subsequences to the query is measured with DTW (Dynamic Time Warping), whose computational complexity is quadratic in the subsequence length. The algorithm uses lower bounding to discard subsequences that are clearly dissimilar to the query without computing DTW. The lower bounds are cheaper to compute than DTW and are calculated in parallel. The imputed value is the arithmetic mean of the last elements of the found intervals. In computational experiments, the proposed algorithm demonstrates high imputation accuracy compared with analogous methods and performance acceptable for real-time applications.
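
The pipeline described in the abstract can be illustrated with a minimal sequential sketch in Python. It is not the authors' implementation. The function names (`dtw`, `lb_first_last`, `best_match_end`, `impute`) are hypothetical, and so is the choice of a simple first/last-point lower bound, since the abstract does not say which bounds the algorithm uses. The sketch also assumes unconstrained DTW with squared pointwise cost, a single best match per reference series, candidates that do not overlap the query, and no normalization. The parallel computation of lower bounds is omitted.

```python
def dtw(a, b):
    """Unconstrained DTW with squared pointwise cost, O(len(a) * len(b))."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = (x - y) ** 2 + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[len(b)]


def lb_first_last(q, c):
    """O(1) lower bound on dtw(q, c) for equal-length q and c (assumed bound).

    Every warping path contains the first and the last cell, so their
    costs can never exceed the full DTW distance.
    """
    first = (q[0] - c[0]) ** 2
    last = (q[-1] - c[-1]) ** 2
    return first + last if len(q) > 1 else first


def best_match_end(ref, t, m):
    """End index of the subsequence of ref most similar to the query.

    The query is ref[t-m+1 .. t]; candidates end at or before t - m, so
    they do not overlap the query (an assumption of this sketch).
    """
    query = ref[t - m + 1 : t + 1]
    best_dist, best_end = float("inf"), None
    for e in range(m - 1, t - m + 1):
        cand = ref[e - m + 1 : e + 1]
        if lb_first_last(query, cand) >= best_dist:
            continue  # pruned: cannot beat the best match found so far
        d = dtw(query, cand)
        if d < best_dist:
            best_dist, best_end = d, e
    return best_end


def impute(target, refs, t, m):
    """Mean of the target values at the ends of the matched intervals."""
    ends = [e for e in (best_match_end(r, t, m) for r in refs) if e is not None]
    if not ends:
        return None  # not enough history to form any candidate
    return sum(target[e] for e in ends) / len(ends)
```

With a target series whose value at index `t` is missing and a list `refs` of equally long reference series, `impute(target, refs, t, m)` returns the estimate for `target[t]`. Because the bound costs constant time while DTW is quadratic in `m`, each pruned candidate saves a full DTW computation.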

