Abstract

Aggregation operations play an essential role in time series database management. As data volume grows, current solutions such as summary tables and MapReduce-based methods struggle to answer aggregation queries with low latency. Other approaches, such as segment tree-based methods, have poor insertion performance once the data size exceeds the available memory. This paper proposes a Persistent Index for Segmented Aggregations (PISA), which offers fast insertion and low-latency aggregation queries. PISA uses a forest to overcome the insertion performance disadvantage of traditional segment trees. By defining two kinds of tags, namely code numbers and serial numbers, we propose an algorithm that accelerates queries by avoiding unnecessary disk reads. Additionally, we extend PISA to Dual-PISA, which tolerates a range of out-of-order data, a property that is very important in real-world workloads. Dual-PISA is stored on disk and is highly memory-efficient: it takes only a few hundred bytes of memory for billions of data points. Dual-PISA can be easily implemented on both traditional databases and NoSQL systems. It handles aggregation queries within milliseconds on a commodity server, for a time range that contains tens of billions of data points.
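To make the forest idea concrete, the following is a minimal sketch of a generic append-only forest of perfect binary trees for range sums. It is not the PISA algorithm itself: the abstract does not define code numbers, serial numbers, or the on-disk layout, so none of that is modeled here. The names `Node`, `ForestSumIndex`, `append`, and `range_sum` are hypothetical, and an in-memory list stands in for disk storage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    lo: int                      # first position covered (inclusive)
    hi: int                      # last position covered (inclusive)
    total: float                 # sum of the values in [lo, hi]
    left: Optional[int] = None   # log index of the left child (None for a leaf)
    right: Optional[int] = None  # log index of the right child (None for a leaf)


class ForestSumIndex:
    """Hypothetical illustration: an append-only forest of perfect binary trees."""

    def __init__(self) -> None:
        self.log: List[Node] = []   # append-only node log (assumed stand-in for disk)
        self.roots: List[int] = []  # log indices of the current tree roots
        self.count = 0              # number of data points appended so far

    def append(self, value: float) -> None:
        # Each new point becomes a one-node tree at the end of the log.
        pos = self.count
        self.count += 1
        self.log.append(Node(pos, pos, value))
        self.roots.append(len(self.log) - 1)
        # Merge the two newest trees while they cover the same number of points,
        # like carrying in a binary counter. Existing nodes are never rewritten.
        while len(self.roots) >= 2:
            a = self.log[self.roots[-2]]
            b = self.log[self.roots[-1]]
            if a.hi - a.lo != b.hi - b.lo:
                break
            right = self.roots.pop()
            left = self.roots.pop()
            self.log.append(Node(a.lo, b.hi, a.total + b.total, left, right))
            self.roots.append(len(self.log) - 1)

    def range_sum(self, lo: int, hi: int) -> float:
        # Sum over positions lo..hi (inclusive) across every tree in the forest.
        return sum(self._query(root, lo, hi) for root in self.roots)

    def _query(self, idx: int, lo: int, hi: int) -> float:
        node = self.log[idx]
        if hi < node.lo or node.hi < lo:
            return 0             # disjoint: skip this subtree entirely
        if lo <= node.lo and node.hi <= hi:
            return node.total    # fully covered: use the stored aggregate
        # Partial overlap can only happen at internal nodes, so children exist.
        return self._query(node.left, lo, hi) + self._query(node.right, lo, hi)


index = ForestSumIndex()
for v in range(1, 11):           # append the values 1..10 at positions 0..9
    index.append(v)
print(index.range_sum(2, 5))     # sum of the values at positions 2..5
```

In this sketch an insertion only appends nodes and never rewrites existing ones, and the only state that must be held in memory is the short list of roots, which grows logarithmically with the number of points. A query that fully covers a subtree reads one stored aggregate instead of the underlying points.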
