Abstract

Time series data are often recorded at regular time intervals, e.g., sensor data collected at a pre-specified frequency in IoT scenarios, air quality data regularly recorded by outdoor monitors, and GPS signals periodically received from multiple satellites. However, owing to issues such as transmission latency, device failure, and repeated requests, timestamps can be dirty and lead to irregular time intervals. Repairing the irregular time intervals has clear benefits: it improves data quality and also enables more accurate applications, such as frequency-domain analysis, and more effective compression in storage. The timestamp repairing problem is challenging, however, because many interacting factors must be determined, including the time interval, the start timestamp, the series length, and the matching between the time series before and after repairing. Our contributions in this paper are (1) formalizing the timestamp repairing problem for regular-interval time series as minimizing the cost w.r.t. move, insert, and delete operations; (2) devising an exact approach with advanced pruning strategies based on lower bounds of the repair cost; and (3) proposing an approximation based on bi-directional dynamic programming. The experimental results demonstrate the superiority of our proposal in both timestamp repair accuracy and the aforementioned applications. Remarkably, the repair results can also be used to evaluate time series data quality measures. Both the repair and measure functions have been implemented in an open-source time series database, Apache IoTDB.
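To make the cost model concrete, the following is a minimal sketch of how a repair cost over move, insert, and delete operations could be computed once the time interval, start timestamp, and series length are fixed. It is not the exact or approximate algorithm of the paper, which must also search over those three factors. The function name `repair_cost`, its parameters, and the choice of absolute time difference as the move cost are assumptions made for illustration only.

```python
from typing import List, Sequence


def repair_cost(
    observed: Sequence[float],
    start: float,
    interval: float,
    length: int,
    insert_cost: float,
    delete_cost: float,
) -> float:
    """Minimum cost of aligning observed timestamps to a fixed regular grid.

    Hypothetical cost model (an assumption, not taken from the paper):
      - move:   |observed timestamp - grid timestamp|
      - insert: a grid timestamp with no observed counterpart
      - delete: an observed timestamp dropped from the series
    """
    # Candidate repaired series: start, start + interval, ...
    targets: List[float] = [start + j * interval for j in range(length)]
    n, m = len(observed), len(targets)

    # dp[i][j]: minimum cost of aligning the first i observed
    # timestamps with the first j grid timestamps.
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * delete_cost
    for j in range(1, m + 1):
        dp[0][j] = j * insert_cost

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            move = dp[i - 1][j - 1] + abs(observed[i - 1] - targets[j - 1])
            delete = dp[i - 1][j] + delete_cost
            insert = dp[i][j - 1] + insert_cost
            dp[i][j] = min(move, delete, insert)

    return dp[n][m]
```

For instance, with observed timestamps `[0, 10, 21, 40]` and a grid starting at 0 with interval 10 and length 5, the cheapest alignment under these assumed costs moves 21 to 20 and inserts the missing point at 30. Evaluating this function for many candidate intervals, starts, and lengths would be a brute-force baseline; the pruning and bi-directional dynamic programming described in the abstract address exactly that search.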
