Effect of Spectrogram Parameters and Noise Types on The Performance of Spectro-temporal Peaks Based Audio Search Method

Murat Köseoğlu, Hakan Uyanik

doi:10.35378/gujs.1000594

Abstract

Audio search algorithms are used to locate a queried file in large databases, especially in multimedia applications. They are expected to perform this detection reliably, robustly, and in the shortest possible time. In this study, an audio fingerprinting algorithm based on the spectral peaks method, with a few minor modifications, was developed to detect the matching audio file in a target database. The method has two stages: audio fingerprint extraction and matching. In the first stage, fingerprint features are extracted by hash functions from the spectral peaks on the spectrograms of the audio files. This state-of-the-art technique considerably reduces processing load and time compared with traditional methods. In the second stage, the fingerprint data of the queried file are compared with the fingerprint data stored in the database during the first stage. The algorithm was demonstrated, and the effect of the spectrogram parameters (window size, overlap, number of FFT points) on reliability and robustness was investigated under different noise sources. The study also aims to contribute to future audio retrieval research based on the spectral peaks method. Variation in the spectrogram parameters was observed to significantly affect the number of matches, reliability, and robustness. Under high noise conditions, the optimal spectrogram parameters were determined to be 512 (window size), 50% (overlap), and 512 (number of FFT points). With these parameters, the algorithm generally detected the queried file in the database successfully, even under high noise conditions. No significant effect of music genre was observed.
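
The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the use of NumPy/SciPy, the function names (`spectral_peaks`, `fingerprint`, `build_index`, `match`), the peak-picking neighborhood, the median threshold, the fan-out, and the hash layout `(f1, f2, dt)` are all assumptions chosen for illustration. Only the spectrogram defaults (window size 512, 50% overlap, 512 FFT points) are taken from the abstract.

```python
from collections import Counter

import numpy as np
from scipy.ndimage import maximum_filter
from scipy.signal import spectrogram


def spectral_peaks(x, fs, nperseg=512, overlap=0.5, nfft=512, neighborhood=15):
    """Return (frame, freq_bin) peaks of the log spectrogram, sorted by frame.

    nperseg/overlap/nfft default to the values the abstract reports as optimal;
    `neighborhood` and the median threshold are hypothetical choices.
    """
    noverlap = int(nperseg * overlap)
    _, _, sxx = spectrogram(x, fs=fs, window="hann", nperseg=nperseg,
                            noverlap=noverlap, nfft=nfft)
    log_s = 10.0 * np.log10(sxx + 1e-12)
    # A bin is a peak if it equals the maximum of its local neighborhood
    # and lies above the median level of the whole spectrogram.
    is_peak = (maximum_filter(log_s, size=neighborhood) == log_s)
    is_peak &= log_s > np.median(log_s)
    freq_idx, time_idx = np.nonzero(is_peak)
    order = np.argsort(time_idx, kind="stable")
    return list(zip(time_idx[order].tolist(), freq_idx[order].tolist()))


def fingerprint(peaks, fan_out=5, max_dt=64):
    """Pair each anchor peak with up to `fan_out` later peaks (assumed scheme).

    Each hash is the tuple (f1, f2, dt), stored with the anchor frame t1.
    """
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for j in range(i + 1, len(peaks)):
            t2, f2 = peaks[j]
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt <= 0:  # skip peaks in the same frame as the anchor
                continue
            hashes.append(((f1, f2, dt), t1))
            paired += 1
            if paired == fan_out:
                break
    return hashes


def build_index(tracks, fs):
    """Stage 1: map every hash to the (track_id, frame) pairs where it occurs."""
    index = {}
    for track_id, samples in tracks.items():
        for h, t in fingerprint(spectral_peaks(samples, fs)):
            index.setdefault(h, []).append((track_id, t))
    return index


def match(query, fs, index):
    """Stage 2: vote on (track, frame offset); return (track, offset, votes)."""
    votes = Counter()
    for h, tq in fingerprint(spectral_peaks(query, fs)):
        for track_id, td in index.get(h, ()):
            votes[(track_id, td - tq)] += 1
    if not votes:
        return None, None, 0
    (track_id, offset), score = votes.most_common(1)[0]
    return track_id, offset, score
```

In this sketch, a query is attributed to the track whose hashes agree on a single time offset most often, so the vote count plays the role of the "number of matches" that the abstract reports as sensitive to the spectrogram parameters. Re-running `build_index` and `match` with other `nperseg`, `overlap`, and `nfft` values, and with noise added to the query, is one way to reproduce that kind of comparison on one's own data.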
