Abstract

Data compression techniques hinged on the notion of a motif are presented, where a motif is interpreted as a string of intermittently solid and wild characters that recurs more or less frequently in an input sequence or family of sequences. Various motif discovery techniques and tools have been devised. The discovery task is made difficult by the fact that the number of motifs identifiable in a sequence can, in general, be exponential in the size of that sequence. A significant reduction in the number of motifs is achieved through the introduction of irredundant motifs, which in intuitive terms are motifs whose structure and list of occurrences cannot be inferred from a combination of other motifs' occurrences. The number of irredundant motifs in a sequence is at worst linear in the length of the sequence. It is shown that irredundant motifs can be usefully exploited in lossy compression methods based on textual substitution and suitable for signals as well as text. Preliminary experiments with these fungible strategies at the crossroads of lossless and lossy data compression show performances that improve over popular methods by more than 20% in lossy and 10% in lossless implementations.
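
As a small illustration of the motif notion used above, the following Python sketch lists the positions at which a pattern of solid and wild characters occurs in a text. It is not the discovery or compression procedure of the paper. The wildcard symbol `.` and the helper names `matches_at` and `occurrences` are assumptions made for this example only.

```python
WILDCARD = "."  # hypothetical don't-care symbol; the abstract does not fix a notation


def matches_at(motif, text, start):
    """Return True if motif matches text at position start.

    A solid character must equal the text character; a wild character
    matches any single character.
    """
    if start < 0 or start + len(motif) > len(text):
        return False
    return all(m == WILDCARD or m == text[start + k] for k, m in enumerate(motif))


def occurrences(motif, text):
    """Return the sorted list of start positions where motif occurs in text."""
    return [i for i in range(len(text) - len(motif) + 1) if matches_at(motif, text, i)]


if __name__ == "__main__":
    text = "abcabdabe"
    print(occurrences("ab.", text))  # [0, 3, 6]
    print(occurrences("a.c", text))  # [0]
```

Here `ab.` recurs three times while `a.c` occurs only once. Enumerating every such pattern naively is what can grow exponentially with the length of the sequence, which is why restricting attention to irredundant motifs matters.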
