Abstract

We describe practical improvements for parallel BWT-based lossless compressors, which are frequently used in modern big data applications. We propose a clustering-based data permutation approach that improves the compression ratio for data with significant alphabet variation, along with a faster string sorting approach based on applying the [Formula: see text] complexity counting sort with permutation reindexing.
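
As a rough illustration of the sorting idea, the sketch below shows one way a counting sort can return a permutation that is then used to reindex the current ordering of rotations. It is a minimal sketch under our own assumptions, not the paper's algorithm: the names `counting_sort_permutation` and `bwt` are hypothetical, the input is treated as bytes (alphabet size 256), and the rotations are sorted with a simple least-significant-position-first radix scheme that needs one counting-sort pass per character position, so the whole transform is quadratic in the input length.

```python
def counting_sort_permutation(keys, alphabet_size):
    """Return a stable permutation perm such that keys[perm[i]] is non-decreasing.

    Hypothetical helper: keys must be integers in range(alphabet_size).
    """
    counts = [0] * (alphabet_size + 1)
    for k in keys:
        counts[k + 1] += 1
    # Prefix sums turn counts[k] into the first output slot for key k.
    for c in range(alphabet_size):
        counts[c + 1] += counts[c]
    perm = [0] * len(keys)
    for i, k in enumerate(keys):
        perm[counts[k]] = i
        counts[k] += 1
    return perm


def bwt(data: bytes):
    """Naive BWT: sort all rotations with repeated counting-sort passes.

    Returns (last_column, primary_index). Illustrative only.
    """
    n = len(data)
    if n == 0:
        return b"", 0
    order = list(range(n))  # order[i] = start offset of the i-th rotation
    # Process character positions from last to first; stability of each
    # pass preserves the ordering established by the later positions.
    for offset in range(n - 1, -1, -1):
        keys = [data[(start + offset) % n] for start in order]
        perm = counting_sort_permutation(keys, 256)
        # Permutation reindexing: apply perm to the current ordering
        # instead of moving the rotations themselves.
        order = [order[p] for p in perm]
    last_column = bytes(data[(start - 1) % n] for start in order)
    return last_column, order.index(0)


if __name__ == "__main__":
    print(bwt(b"banana"))
```

Only integer indices are moved between passes, never the strings, which is the point of returning a permutation from the sort. A practical compressor would replace the quadratic outer loop with a suffix-array construction.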
