Abstract

High audio data compression can be achieved by removing irrelevant signal information, i.e., information that even a well-trained or sensitive listener cannot detect. Contemporary audio coding schemes such as MP3, AAC, and Ogg Vorbis identify this irrelevant information during signal analysis by incorporating several psychoacoustic principles into the coder, including absolute hearing thresholds, critical band analysis, simultaneous masking, and temporal masking (Painter and Spanias, 2000). Masking exploits the phenomenon whereby faint but normally audible sounds become inaudible because they are very close in frequency to, or have much smaller amplitudes than, surrounding sounds; the coder can therefore remove them without perceptible loss. Genetic algorithms, which solve problems by modeling Darwinian evolution, have been studied extensively and have recently been applied to audio coding with some success (Galos et al., 2003). To achieve audio compression, a genetic algorithm analyzes a large number of sound files to determine the chunks that are most likely to contain irrelevant signals; the combination of these irrelevant chunks forms a solution that can then be used to compress any sound file. In this paper we present a comparative study of applying psychoacoustic principles and genetic algorithms to compress audio signals. We developed a coder to perform the experiment in which, as in most well-known audio coders, Huffman coding handles the lossless compression stage and the modified discrete cosine transform (MDCT) converts the time-domain signal to the frequency domain. The results are compared using signal-to-noise ratios (SNRs) and subjective testing, in which eighteen subjects (students at CSUSB) were asked to listen to and rate the files decompressed by the two methods.
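
The two quantities used throughout the evaluation can be written down directly. The sketch below is a minimal illustration, not the paper's implementation: it assumes a global (whole-file) SNR and an unwindowed, frame-by-frame MDCT, whereas a practical coder applies overlapping windowed frames before the transform.

```python
import numpy as np

def mdct(frame):
    """Direct (unwindowed) MDCT of one frame of 2N samples -> N coefficients.
    Illustrates only the transform used to move to the frequency domain;
    real coders window and overlap consecutive frames first."""
    frame = np.asarray(frame, dtype=np.float64)
    N = len(frame) // 2
    n = np.arange(2 * N)
    k = np.arange(N).reshape(-1, 1)
    basis = np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
    return basis @ frame

def snr_db(original, decoded):
    """Global SNR in dB between an original signal and its decoded
    (compressed-then-decompressed) version; arrays must be aligned
    and of equal length."""
    original = np.asarray(original, dtype=np.float64)
    decoded = np.asarray(decoded, dtype=np.float64)
    noise_power = np.sum((original - decoded) ** 2)   # coding error energy
    if noise_power == 0.0:                            # perfect reconstruction
        return float("inf")
    return 10.0 * np.log10(np.sum(original ** 2) / noise_power)
```

Here the higher the SNR, the less audible distortion the compression stage is likely to have introduced; the subjective listening test provides the complementary perceptual judgment.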
