Abstract

Replaying historical time series data is an important way to analyze it. When massive sensor data is replayed, long processing times are common and significantly degrade the performance of the whole system. Existing solutions do not consider the requirements of large-scale sensor data replay. Therefore, a data replay mechanism based on a concurrent buffer pool is proposed. In this mechanism, a multi-level buffer queue is designed and implemented using a priority queue to ensure efficiency and stability during replay. A task queue is then adopted to ensure the correct order when multiple threads fill the multi-level buffer queue. Moreover, an anomaly detection operator is integrated to reduce outliers in the replayed data. The data replay mechanism is evaluated in several experiments on a real hydrologic dataset. The results show that it performs better than Apache Phoenix and Apache Hive.
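
The ordering problem that arises when several threads fill a shared replay buffer can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes each chunk carries a sequence number, uses that number as the priority in a single heap-based buffer, and replaces the multi-level queue and the task queue with a simple "release only the next expected chunk" rule. The names `OrderedReplayBuffer`, `load_chunk`, and `replay` are hypothetical, and `load_chunk` only stands in for a real storage read.

```python
import heapq
import threading
from concurrent.futures import ThreadPoolExecutor


class OrderedReplayBuffer:
    """Accepts chunks in any order and releases them strictly by sequence number."""

    def __init__(self):
        self._heap = []          # min-heap of (seq, chunk)
        self._next_seq = 0       # sequence number the consumer expects next
        self._cond = threading.Condition()

    def put(self, seq, chunk):
        # Called by loader threads; chunks may arrive out of order.
        with self._cond:
            heapq.heappush(self._heap, (seq, chunk))
            self._cond.notify_all()

    def get(self):
        # Blocks until the chunk with the expected sequence number is available.
        with self._cond:
            while not self._heap or self._heap[0][0] != self._next_seq:
                self._cond.wait()
            _, chunk = heapq.heappop(self._heap)
            self._next_seq += 1
            return chunk


def load_chunk(seq, size=3):
    # Hypothetical stand-in for reading one time window from storage:
    # returns (timestamp, value) pairs for the window numbered `seq`.
    start = seq * size
    return [(t, float(t)) for t in range(start, start + size)]


def replay(num_chunks, workers=4):
    """Load chunks concurrently, then emit all records in timestamp order."""
    buf = OrderedReplayBuffer()
    records = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for seq in range(num_chunks):
            pool.submit(lambda s=seq: buf.put(s, load_chunk(s)))
        for _ in range(num_chunks):
            records.extend(buf.get())
    return records


if __name__ == "__main__":
    for timestamp, value in replay(5):
        print(timestamp, value)
```

The buffer in this sketch is unbounded, so a fast loader could outrun the consumer; a production design would cap its size and apply back-pressure, which is one of the roles a bounded multi-level buffer can play.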
