Experimental comparison of data compression algorithms

Andrii Hlybovets, Volodymyr Yablonskyi

doi:10.18523/2617-3808.2019.2.43-49

Abstract

The amount of data that is stored and transferred grows steadily and rapidly. When large volumes of data have to be transferred, a data compression algorithm can be useful: a well-chosen algorithm can reduce the size of the data by up to 60%. The problem of creating new algorithms and of modifying or optimizing existing ones therefore remains relevant. This article examines four of the most widespread data compression algorithms. Shannon–Fano coding, named after Claude Shannon and Robert Fano, is a technique for constructing a prefix code from a set of symbols and their probabilities. It is suboptimal in the sense that, unlike Huffman coding, it does not always achieve the lowest possible expected code word length. A Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. Such a code is found by means of Huffman coding, an algorithm developed by David A. Huffman. Lempel–Ziv–Storer–Szymanski (LZSS) is a lossless data compression algorithm, a derivative of LZ77, that was created in 1982 by James Storer and Thomas Szymanski. The main difference between LZ77 and LZSS is that in LZ77 a dictionary reference could actually be longer than the string it replaced. In LZSS, such references are omitted if the match length is less than the “break even” point. Furthermore, LZSS uses one-bit flags to indicate whether the next chunk of data is a literal (byte) or a reference given as an offset/length pair. Lempel–Ziv–Welch (LZW) is a lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published as an improved implementation of the LZ78 algorithm. As part of the work on this article, these four algorithms were implemented and their compression quality and speed were analyzed experimentally. The experiments showed that LZSS achieved the best compression speed and LZW achieved the best compression ratio.
The work can be useful for researchers in the field of data compression.

Highlights

The volumes of data that are stored and transmitted are constantly growing
It is encoded as the index of its prefix plus an additional symbol
Habr [Electronic resource]: LZW, LZ77 and LZ78 algorithms. – 2014. – Access mode: https://habr.com/en/post/132683. – Title from the screen

Summary

Shannon–Fano coding

The symbols of the primary alphabet m1 are written out in order of decreasing probability. When the size of a sub-alphabet becomes zero or one, the prefix code of the corresponding symbols of the primary alphabet is not extended any further. The alphabet-splitting step is ambiguous, since the difference of the total probabilities p_0 - p_1 may be the same for two candidate splits (given that all symbols of the primary alphabet have a probability greater than zero). The set of symbols is split into two subsets with approximately equal total probabilities. Each of these subsets is then split again into two subsets with approximately equal total probabilities. When a Shannon–Fano code is constructed, the partition of the set of elements can, generally speaking, be chosen in several ways. For this reason the Shannon–Fano code is not optimal in general, although it gives optimal results for some probability distributions. Shannon–Fano coding is a rather old compression method and is of no particular practical interest today. Because non-optimal Shannon–Fano codes can be produced for some sequences, compression by the Huffman method is considered more efficient [1; 4].
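The splitting procedure described above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function name `shannon_fano`, the use of integer counts in place of probabilities, and the rule of taking the first cut point that minimizes |p_0 - p_1| (one of several possible ways to resolve the ambiguity) are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def shannon_fano(weights: Dict[str, float]) -> Dict[str, str]:
    """Build a Shannon-Fano prefix code from positive symbol weights."""
    # Write the symbols out in order of decreasing weight (probability).
    ordered = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    codes = {sym: "" for sym, _ in ordered}

    def split(group: List[Tuple[str, float]]) -> None:
        # A sub-alphabet of size zero or one is not extended any further.
        if len(group) <= 1:
            return
        total = sum(w for _, w in group)
        running = 0
        best_cut, best_diff = 1, float("inf")
        # Try every cut point; keep the FIRST one minimising |p_0 - p_1|.
        # (Assumed tie-break rule: other choices give other valid codes.)
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(running - (total - running))
            if diff < best_diff:
                best_cut, best_diff = i, diff
        left, right = group[:best_cut], group[best_cut:]
        for sym, _ in left:
            codes[sym] += "0"
        for sym, _ in right:
            codes[sym] += "1"
        split(left)
        split(right)

    split(ordered)
    return codes
```

For the counts a=15, b=7, c=6, d=6, e=5 this sketch assigns the codes 00, 01, 10, 110 and 111 respectively. Note that an alphabet with a single symbol receives an empty code here, which a practical encoder would have to handle separately.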

Huffman algorithm
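The abstract describes a Huffman code as an optimal prefix code. A minimal Python sketch of the standard bottom-up construction, which repeatedly merges the two lightest subtrees, is given below. The function name `huffman_code`, the integer tie-breaker, and the choice to give a lone symbol the code "0" are assumptions of this example and are not taken from the article.

```python
import heapq
from typing import Dict


def huffman_code(weights: Dict[str, int]) -> Dict[str, str]:
    """Build a Huffman prefix code from symbol weights (e.g. byte counts)."""
    if not weights:
        return {}
    if len(weights) == 1:
        # Assumed convention: a lone symbol still gets a one-bit code.
        return {sym: "0" for sym in weights}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Take the two lightest subtrees and merge them under a new node.
        w0, _, codes0 = heapq.heappop(heap)
        w1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]
```

For the counts a=15, b=7, c=6, d=6, e=5 the resulting code gives the symbol a a one-bit code word and the other four symbols three-bit code words, for a total encoded length of 87 bits.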

LZSS algorithm
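The two LZSS features named in the abstract, the “break even” threshold and the one-bit literal/reference flag, can be illustrated with the Python sketch below. It is a simplified illustration, not the article's implementation: tokens are kept as Python tuples instead of packed bits, the match search is brute force, and the constants `WINDOW`, `MIN_MATCH` and `MAX_MATCH` are assumed values chosen for the example.

```python
from typing import List, Tuple

WINDOW = 4096    # assumed sliding-window size in bytes
MIN_MATCH = 3    # assumed "break even" length: shorter matches stay literals
MAX_MATCH = 18   # assumed maximum match length


def lzss_encode(data: bytes) -> List[Tuple[int, ...]]:
    """Return tokens: (0, byte) for a literal, (1, offset, length) for a reference."""
    tokens: List[Tuple[int, ...]] = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Brute-force search of the window for the longest match.
        for j in range(max(0, i - WINDOW), i):
            length = 0
            while (length < MAX_MATCH and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= MIN_MATCH:
            tokens.append((1, best_off, best_len))  # flag 1: offset/length pair
            i += best_len
        else:
            tokens.append((0, data[i]))             # flag 0: literal byte
            i += 1
    return tokens


def lzss_decode(tokens: List[Tuple[int, ...]]) -> bytes:
    out = bytearray()
    for tok in tokens:
        if tok[0] == 0:
            out.append(tok[1])
        else:
            _, offset, length = tok
            # Copy byte by byte so that overlapping references work.
            for _ in range(length):
                out.append(out[-offset])
    return bytes(out)
```

With these settings, `lzss_encode(b"aaaaaaaa")` produces one literal followed by a single overlapping reference, `[(0, 97), (1, 1, 7)]`. A real encoder would pack the flags and the offset/length pairs into a bit stream and would use a faster match search.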

LZW algorithm
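A minimal Python sketch of textbook LZW, in which each new dictionary phrase is a known prefix plus one additional symbol, is given below. It is an illustration under stated assumptions and not the authors' code: the function names are hypothetical, the dictionary starts with the 256 single-byte strings and grows without bound, and the output is a list of integer codes with no variable-width bit packing.

```python
from typing import List


def lzw_encode(data: bytes) -> List[int]:
    # The dictionary starts with every single-byte string.
    table = {bytes([i]): i for i in range(256)}
    phrase = b""
    codes: List[int] = []
    for value in data:
        candidate = phrase + bytes([value])
        if candidate in table:
            phrase = candidate              # keep extending the known phrase
        else:
            codes.append(table[phrase])     # emit the longest known prefix
            table[candidate] = len(table)   # learn prefix + one extra symbol
            phrase = bytes([value])
    if phrase:
        codes.append(table[phrase])
    return codes


def lzw_decode(codes: List[int]) -> bytes:
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    prev = table[codes[0]]
    out = bytearray(prev)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the phrase that is being defined right now.
            entry = prev + prev[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[len(table)] = prev + entry[:1]
        prev = entry
    return bytes(out)
```

For example, `lzw_encode(b"ABABABA")` returns `[65, 66, 256, 258]`. The last code exercises the special case in the decoder, because 258 is emitted before the decoder has finished adding it to its own table.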

Implementation technologies and environments

Implementation of the Shannon–Fano algorithm

Implementation of the Huffman algorithm

Implementation of the LZSS algorithm

Implementation of the LZW algorithm

Comparison of execution time

Comparison of compression efficiency

Journal: NaUKMA Research Papers. Computer Science
Publication Date: Dec 2, 2019
License type: CC BY
