Abstract

The Fixed Constraints Transform (FCT) encodes text using a dictionary that maps each word in the text to its corresponding transform. The dictionary is generated once and saved in binary form for faster word indexing. The method is designed strictly for text compression and performs best when the text has normal formatting: within a sentence, only the first word starts with an upper-case letter and the rest is in lower case. Because the algorithm modifies the words of the text while leaving punctuation marks, spaces, and other special characters unaltered, its performance is determined by the ratio between the number of letters in the text and the total number of characters. In terms of compression, FCT performs close to other frequently used transforms (Star, Burrows-Wheeler, etc.), but it executes faster. The lossless data compression algorithms applied for testing are RLE (Run-Length Encoding), arithmetic coding, PPMd (Prediction by Partial Matching), BZip2, Deflate (WinZip), LAMA, and RAR. The following indicators of compression performance were measured: the time required to generate the transform, the compression rate, and the time required for compression. The text files used to evaluate performance are from the Calgary Corpus. FCT yields compression performance close to that of the transforms commonly used as pre-compression methods (BWT, the Star Transform, and its derivatives). FCT is suited for use in a processing chain whose purpose is lossless data compression. The transform itself does not achieve significant compression but, most importantly, it helps the compression algorithm applied after it by eliminating some redundant information and features specific to the idioms of a given language.
FCT is an efficient data-processing method with notable results. It is easy to implement and can be used in a lossless compression chain for both streams and files in common applications.
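
The abstract does not give the FCT encoding itself, so the following Python sketch only illustrates the general idea of a reversible, dictionary-based word transform that leaves punctuation and spaces untouched, together with the letter ratio mentioned above. The function names, the `~index~` code format, and the requirement that the input contain no `~` character are all assumptions made for this illustration; they are not taken from the FCT specification.

```python
import re

WORD = re.compile(r"[A-Za-z]+")   # a "word" here is a run of ASCII letters
CODE = re.compile(r"~(\d+)~")     # hypothetical code format: ~index~


def letter_ratio(text: str) -> float:
    """Fraction of the characters in `text` that are letters."""
    if not text:
        return 0.0
    return sum(ch.isalpha() for ch in text) / len(text)


def encode(text: str, words: list[str]) -> str:
    """Replace every word found in `words` by its code; keep the rest as-is."""
    if "~" in text:
        # Assumption of this toy scheme: '~' is reserved for codes.
        raise ValueError("input must not contain '~'")
    index = {w: i for i, w in enumerate(words)}

    def substitute(match: re.Match) -> str:
        word = match.group()
        return f"~{index[word]}~" if word in index else word

    return WORD.sub(substitute, text)


def decode(text: str, words: list[str]) -> str:
    """Invert `encode` by mapping each code back to its dictionary word."""
    return CODE.sub(lambda m: words[int(m.group(1))], text)


if __name__ == "__main__":
    dictionary = ["the", "and"]   # toy dictionary, lower-case entries only
    sample = "The cat and the dog, and the bird."
    transformed = encode(sample, dictionary)
    print(transformed)
    print(decode(transformed, dictionary) == sample)
    print(round(letter_ratio(sample), 3))
```

In this toy scheme the codes are not shorter than the words they replace; a real transform would choose codes that make the output more predictable for the compressor applied afterwards. The sketch also shows why the letter ratio matters: only the letter runs are candidates for substitution, so a text with few letters relative to punctuation, digits, and spaces offers the transform little to work on.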
