Abstract

Fully exploiting high-bandwidth storage and network technologies requires an improvement in the speed at which we can decompress data. We present a “refine and recycle” method applicable to LZ77-type decompressors that enables efficient high-bandwidth designs, and we describe an implementation in reconfigurable logic. The method refines the write commands (for literal tokens) and read commands (for copy tokens) into a set of commands that each target a single bank of block RAM; rather than performing all the dependency calculations up front, it saves logic by recycling (read) commands that return an invalid result. A single “Snappy” decompressor implemented in reconfigurable logic with this method processes multiple literal or copy tokens per cycle and achieves up to 7.2 GB/s, enough to keep pace with an NVMe device. The proposed method is about an order of magnitude faster and an order of magnitude more power-efficient than a state-of-the-art single-core software implementation. The logic and block RAM resources required by the decompressor are low enough that a set of these decompressors fits on a single FPGA of reasonable size and can keep up with the bandwidth provided by the most recent interface technologies.
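The refine-and-recycle idea can be illustrated with a small software model. The sketch below is not the authors' RTL design: it stands in for the hardware by refining literal/copy tokens into single-byte commands (the per-bank refinement of the paper collapsed to one command per byte), issuing a fixed number of commands per simulated cycle, and recycling any read that targets a byte not yet committed in an earlier cycle instead of stalling on the read-after-write dependency. The token tuples, `width` parameter, and helper names are all illustrative assumptions.

```python
# Software model (illustrative, not the paper's hardware) of "refine and
# recycle" for an LZ77-style decompressor.
from collections import deque

def refine(tokens):
    """Refine ("lit", bytes) / ("copy", offset, length) tokens into
    single-byte commands carrying absolute output positions."""
    cmds, pos = [], 0
    for tok in tokens:
        if tok[0] == "lit":
            for ch in tok[1]:
                cmds.append(("write", pos, ch))   # literal byte
                pos += 1
        else:
            _, off, length = tok
            for _ in range(length):
                cmds.append(("read", pos, pos - off))  # copy one byte
                pos += 1
    return cmds, pos

def decompress(tokens, width=4):
    """Issue up to `width` commands per simulated cycle. Reads only see
    bytes committed in *earlier* cycles; a read that arrives too early
    returns invalid and is recycled to the back of the queue, so no
    dependency tracking or stalling is needed."""
    cmds, total = refine(tokens)
    out = [None] * total
    committed = [False] * total        # visible to reads next cycle
    queue = deque(cmds)
    while queue:
        batch = [queue.popleft() for _ in range(min(width, len(queue)))]
        done = []
        for cmd in batch:
            if cmd[0] == "write":
                out[cmd[1]] = cmd[2]
                done.append(cmd[1])
            else:
                _, dst, src = cmd
                if committed[src]:
                    out[dst] = out[src]
                    done.append(dst)
                else:
                    queue.append(cmd)  # invalid result: recycle it
        for p in done:
            committed[p] = True        # becomes visible next cycle
    return "".join(out)
```

Note how an overlapping copy such as `("copy", 2, 4)` after the literal `"ab"` resolves over several cycles: the early reads come back invalid, are recycled, and succeed once their source bytes are committed, with no stall logic in the issue path.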

Highlights

  • Compression and decompression algorithms are widely used to reduce storage space and data transmission bandwidth

  • Rather than spending a large amount of logic on computing dependencies and scheduling operations, a recycle method is used: each BRAM command executes immediately, and commands that return invalid data are recycled, avoiding stalls caused by read-after-write (RAW) dependencies

  • The field-programmable gate array (FPGA) design is compared with an optimized software Snappy decompression implementation [1], compiled by gcc 7.3.0 with the “-O3” option and running on a POWER9 CPU in little-endian mode under Ubuntu 18.04.1 LTS


Summary

Introduction

Compression and decompression algorithms are widely used to reduce storage space and data transmission bandwidth. Compression and decompression are computation-intensive applications and can consume significant CPU resources. This is especially true for systems that aim to combine in-memory analytics with fast storage such as that provided by multiple NVMe drives. With the best CPU-based Snappy decompressors reaching 1.8 GB/s per core, 40 cores are required just to keep up with this decompression bandwidth. To free CPU resources for other tasks, accelerators such as graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) can be used to accelerate compression and decompression.
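The figures quoted above allow a quick back-of-envelope comparison. Using only numbers stated in this text (1.8 GB/s per software core, 7.2 GB/s per FPGA decompressor), the sketch below computes the aggregate bandwidth that the 40 software cores represent and how many FPGA decompressor instances would be needed to match it; the variable names are ours, not the paper's.

```python
# Back-of-envelope sizing from the figures quoted in the text.
SW_CORE_GBPS = 1.8      # best single-core software Snappy decompression
FPGA_UNIT_GBPS = 7.2    # one FPGA decompressor instance

# Aggregate bandwidth that the quoted 40 CPU cores sustain.
target_gbps = 40 * SW_CORE_GBPS          # 72.0 GB/s

# FPGA decompressor instances needed to match that aggregate rate.
fpga_units = target_gbps / FPGA_UNIT_GBPS  # 10.0 instances

print(target_gbps, fpga_units)
```

Ten instances is consistent with the abstract's claim that a set of these decompressors fits on a single FPGA of reasonable size.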


