Abstract
Compression-based pattern recognition measures the similarity between objects by relying on data compression techniques. This paper improves current compression-based pattern recognition by exploiting new, useful features that are easy to obtain. In particular, we study two known methods: PRDC (Pattern Representation on Data Compression) and NMD (Normalized Multiset Distance). PRDC represents an object x as a feature vector whose elements are the compression ratios obtained by compressing x with multiple dictionaries. We enhance PRDC by extracting novel features from the compressed files. NMD measures the similarity between two objects by comparing their compression dictionaries. We extend NMD by incorporating the length of the words in the dictionaries into the similarity measure.
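As a rough illustration of the two ideas summarized above, the following Python sketch builds LZW-style dictionaries, forms a PRDC-like vector of compression ratios, and compares two dictionaries. It rests on several assumptions that the abstract does not state: the dictionaries are taken to be LZW word sets, the compression ratio is approximated as emitted codes per input byte, and the dictionary comparison uses a Jaccard-style distance as a stand-in rather than the actual NMD formula. The optional length weighting only hints at what "incorporating the length of words" might look like. All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Set


def lzw_dictionary(data: bytes) -> Set[bytes]:
    """Collect the multi-byte words an LZW pass over `data` would learn.

    Single bytes are treated as implicitly present, as in standard LZW.
    """
    words: Set[bytes] = set()
    w = b""
    for i in range(len(data)):
        c = data[i:i + 1]
        wc = w + c
        if len(wc) == 1 or wc in words:
            w = wc            # keep extending the current match
        else:
            words.add(wc)     # learn a new word, restart from c
            w = c
    return words


def compression_ratio(x: bytes, words: Set[bytes]) -> float:
    """Codes emitted per input byte when x is parsed greedily with a
    fixed dictionary (assumption: a simplified proxy for a size ratio)."""
    if not x:
        return 0.0
    max_len = max((len(w) for w in words), default=1)
    codes = 0
    i = 0
    while i < len(x):
        length = min(max_len, len(x) - i)
        # Shrink the candidate until it is a known word or a single byte.
        while length > 1 and x[i:i + length] not in words:
            length -= 1
        i += length
        codes += 1
    return codes / len(x)


def prdc_vector(x: bytes, dictionaries: Sequence[Set[bytes]]) -> List[float]:
    """PRDC-style feature vector: one compression ratio per dictionary."""
    return [compression_ratio(x, d) for d in dictionaries]


def dictionary_distance(
    a: Set[bytes],
    b: Set[bytes],
    weight: Callable[[bytes], float] = lambda w: 1.0,
) -> float:
    """Jaccard-style distance between two dictionaries (a stand-in, not
    the NMD formula). Pass weight=len to let longer words count more."""
    union = sum(weight(w) for w in a | b)
    if union == 0:
        return 0.0
    shared = sum(weight(w) for w in a & b)
    return 1.0 - shared / union
```

A typical use would build one dictionary per reference text with `lzw_dictionary`, call `prdc_vector(x, dictionaries)` to describe a new object x, and call `dictionary_distance(d1, d2, weight=len)` to compare two objects through their dictionaries with word length taken into account.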