An efficient Variable-to-Fixed length encoding using multiplexed parse trees

Satoshi Yoshida, Takuya Kida

doi:10.1016/j.jda.2014.10.005

Journal: Journal of Discrete Algorithms
Publication Date: Nov 13, 2014
Citations: 1
License type: Elsevier-specific OA user license

Affiliation: Hokkaido University

Abstract

We discuss an improved encoding method for Variable-to-Fixed length codes (VF codes). A VF code is a coding scheme that splits an input text into a sequence of consecutive substrings and assigns a fixed-length codeword to each substring. Because all codewords have the same length, they are easy to extract, which makes VF codes suitable for processing compressed texts directly. The Almost Instantaneous VF code (AIVF code), proposed by Yamamoto and Yokoo in 2001, achieves a good compression ratio by using a set of parse trees. However, it requires more time and space for both encoding and decoding than a classical VF code does. In this paper, we prove that the set of parse trees of an AIVF code can be multiplexed into a single compact tree, and that the original encoding and decoding procedures can be simulated using this tree. We also give upper and lower bounds on the number of nodes in the multiplexed parse tree, as well as on the number of nodes saved relative to the original trees. Experimental results show that the proposed method encodes natural language texts much faster than AIVF coding.
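
To make the basic VF idea concrete, the following is a minimal Python sketch of a classical VF code with a single parse tree. It is not the AIVF code or the multiplexed tree described in the abstract. The Tunstall-style construction, the function names (`build_tunstall`, `encode`, `decode`), the symbol probabilities, and the handling of an unparsed tail are all assumptions made for illustration.

```python
import heapq


def build_tunstall(probs, bits):
    """Return the phrases (leaves of a complete parse tree), sorted.

    probs: dict mapping each symbol to its assumed probability.
    bits:  codeword length, so at most 2**bits phrases are allowed.
    """
    k = len(probs)
    max_leaves = 2 ** bits
    # heapq is a min-heap, so probabilities are negated to pop the
    # most probable leaf first.
    heap = [(-p, s) for s, p in probs.items()]
    heapq.heapify(heap)
    # Expanding a leaf removes it and adds k children: net gain k - 1.
    while len(heap) + k - 1 <= max_leaves:
        neg_p, phrase = heapq.heappop(heap)
        for s, p in probs.items():
            heapq.heappush(heap, (neg_p * p, phrase + s))
    return sorted(phrase for _, phrase in heap)


def encode(text, phrases):
    """Split text into phrases and return (codeword indices, tail).

    The tail is the final fragment that did not complete a phrase;
    returning it separately is a simplification for this sketch.
    """
    index = {ph: i for i, ph in enumerate(phrases)}
    codes, cur = [], ""
    for ch in text:
        cur += ch
        # The leaves are prefix-free, so the first match is the phrase.
        if cur in index:
            codes.append(index[cur])
            cur = ""
    return codes, cur


def decode(codes, phrases, tail=""):
    """Every codeword has the same width, so decoding is a table lookup."""
    return "".join(phrases[c] for c in codes) + tail


if __name__ == "__main__":
    phrases = build_tunstall({"a": 0.7, "b": 0.3}, bits=2)
    codes, tail = encode("aaabab", phrases)
    print(phrases, codes, repr(tail))
    print(decode(codes, phrases, tail))
```

With the assumed probabilities and 2-bit codewords, the dictionary is `aaa`, `aab`, `ab`, `b`, and `aaabab` parses as `aaa | b | ab`. Because each codeword occupies the same number of bits, the i-th codeword can be located without scanning the ones before it, which is the property the abstract points to for processing compressed text directly.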

Full Text