Abstract

In many important applications -- such as search engines and relational database systems -- data is stored in the form of arrays of integers. Encoding and, most importantly, decoding of these arrays consume considerable CPU time. Therefore, substantial effort has been made to reduce costs associated with compression and decompression. In particular, researchers have exploited the superscalar nature of modern processors and SIMD instructions. Nevertheless, we introduce a novel vectorized scheme called SIMD-BP128 that improves over previously proposed vectorized approaches. It is nearly twice as fast as the previously fastest schemes on desktop processors (varint-G8IU and PFOR). At the same time, SIMD-BP128 saves up to 2 bits per integer. For even better compression, we propose another new vectorized scheme (SIMD-FastPFOR) that has a compression ratio within 10% of a state-of-the-art scheme (Simple-8b) while being two times faster during decoding.

Highlights

  • Computer memory is a hierarchy of storage devices that range from slow and inexpensive to fast but expensive

  • Varint-G8IU, SIMD-BP128, and SIMD-FastPFOR remain within a factor of six of the

  • We have presented new schemes that are up to twice as fast as the previously best available schemes in the literature while offering competitive compression ratios and encoding speed


Summary

Introduction

Computer memory is a hierarchy of storage devices that range from slow and inexpensive (disk or tape) to fast but expensive (registers or CPU cache). Application performance is often inhibited by access to slower storage devices at lower levels of the hierarchy. Historically, only disks and tapes were considered slow devices, so application developers tended to optimize only disk and/or tape I/O. CPUs have since become so fast that access to main memory is a limiting factor for many workloads [1, 2, 3, 4, 5]. In such cases, data compression can significantly improve query performance by reducing main-memory bandwidth requirements. High-speed compression schemes can therefore improve the performance of database systems [6, 7, 8] and text retrieval engines [9, 10, 11, 12, 13].
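As a rough illustration of why compressing integer arrays reduces memory traffic, the sketch below shows plain scalar bit packing: every value in a block is stored with just enough bits for the largest value. This is not SIMD-BP128 or any other scheme from the paper, and it uses no SIMD instructions. The function names `required_bit_width`, `pack`, and `unpack` are hypothetical and chosen only for this example.

```python
def required_bit_width(values):
    """Smallest number of bits that can represent every value in the block."""
    return max(values, default=0).bit_length()


def pack(values, bit_width):
    """Pack non-negative integers into bytes, using bit_width bits per value."""
    acc = 0
    for i, v in enumerate(values):
        if v < 0 or v >> bit_width:
            raise ValueError(f"{v} does not fit in {bit_width} bits")
        # Place value i at bit offset i * bit_width.
        acc |= v << (i * bit_width)
    n_bytes = (len(values) * bit_width + 7) // 8
    return acc.to_bytes(n_bytes, "little")


def unpack(data, bit_width, count):
    """Recover count integers of bit_width bits each from packed bytes."""
    acc = int.from_bytes(data, "little")
    mask = (1 << bit_width) - 1
    return [(acc >> (i * bit_width)) & mask for i in range(count)]


# Illustrative block of small integers (made-up data).
block = [3, 9, 0, 14, 7, 1, 12, 5]
width = required_bit_width(block)             # largest value is 14, so 4 bits
packed = pack(block, width)
restored = unpack(packed, width, len(block))
```

In this example, eight values occupy 4 bytes instead of the 32 bytes they would need as 32-bit integers, so a decoder reads one eighth as much memory at the cost of some shift-and-mask work.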

