Abstract
We discuss an improved method of Variable-to-Fixed length code (VF code) encoding. A VF code is a coding scheme that splits an input text into a consecutive sequence of substrings, and then assigns a fixed length codeword to each substring. Since all the codewords have the same length, they are easily extracted, and thus, VF code is suitable for processing compressed texts directly. Almost Instantaneous VF code (AIVF code), which was proposed by Yamamoto and Yokoo in 2001, achieves a good compression ratio by using a set of parse trees. However, it requires more time and space for both encoding and decoding than does a classical VF code. In this paper, we prove that the set of parse trees of AIVF code can be multiplexed into a compact single tree and the original encoding and decoding procedures can be simulated using the compacted tree. We also give the upper and lower bounds of the number of nodes in the multiplexed parse tree, in addition to those of the number of nodes reduced from the original trees. The experimental results showed that the proposed method can encode natural language texts much faster than AIVF coding.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have