Abstract

A novel feature set for low-dimensional signal representation is presented, designed for classification or clustering of non-stationary signals with complex variation in time and frequency. The feature representation of a signal is given by the first left and right singular vectors of its ambiguity spectrum matrix. If the ambiguity matrix is of low rank, most signal information in the time direction is captured by the first right singular vector, while the signal's key frequency information is encoded by the first left singular vector. The resemblance of two signals is investigated by means of a suitable similarity assessment of the signals' respective singular vector pairs. Using multitapers to calculate the ambiguity spectrum increases robustness to jitter and background noise and consequently improves performance, compared to estimation based on the ordinary single Hanning window spectrogram. The suggested feature-based signal compression is applied to a syllable-based analysis of a song from the bird species Great Reed Warbler and evaluated by comparison to manual auditory and/or visual signal classification. The results show that the proposed approach outperforms well-known approaches based on mel-frequency cepstral coefficients and spectrogram cross-correlation.
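The feature extraction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the ambiguity spectrum is taken as the magnitude of the 2-D Fourier transform of a spectrogram with rows indexing frequency and columns indexing time, and it uses the absolute cosine between singular vectors as a stand-in similarity. The function names `ambiguity_spectrum`, `first_singular_pair` and `abs_cosine` and the synthetic test signal are hypothetical.

```python
import numpy as np

def ambiguity_spectrum(spectrogram):
    # Assumed definition: magnitude of the centred 2-D FFT of a
    # (frequency x time) spectrogram.
    return np.abs(np.fft.fftshift(np.fft.fft2(spectrogram)))

def first_singular_pair(matrix):
    # Thin SVD; singular values are returned in descending order.
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    u1, v1 = u[:, 0], vt[0, :]
    # Resolve the sign ambiguity of the SVD: make the largest entry of u1 positive.
    if u1[np.argmax(np.abs(u1))] < 0:
        u1, v1 = -u1, -v1
    # Share of the squared singular values carried by the first component;
    # close to 1 when the matrix is nearly rank one.
    energy_share = s[0] ** 2 / np.sum(s ** 2)
    return u1, v1, energy_share

def abs_cosine(a, b):
    # Stand-in similarity (an assumption): absolute cosine of the angle between a and b.
    return abs(np.dot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))

# Synthetic rank-one "spectrogram": one Gaussian blob in frequency and time.
freq_profile = np.exp(-0.5 * ((np.arange(64) - 20) / 3.0) ** 2)
time_profile = np.exp(-0.5 * ((np.arange(128) - 40) / 8.0) ** 2)
spec_a = np.outer(freq_profile, time_profile)
# The same blob, circularly shifted in both frequency and time.
spec_b = np.roll(spec_a, shift=(5, 30), axis=(0, 1))

u_a, v_a, share_a = first_singular_pair(ambiguity_spectrum(spec_a))
u_b, v_b, share_b = first_singular_pair(ambiguity_spectrum(spec_b))
sim_u = abs_cosine(u_a, u_b)  # similarity of the left (frequency) vectors
sim_v = abs_cosine(v_a, v_b)  # similarity of the right (time) vectors
```

Because the magnitude of a Fourier transform does not change under circular shifts, the two synthetic signals yield the same ambiguity spectrum under this assumed definition, so both similarities come out close to 1 even though the spectrogram images themselves barely overlap.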

Highlights

  • In biology, bird song analysis has been a large field for several decades, and for many years, methods based on spectrograms have been considered well suited for the comparison of bird sounds

  • We evaluate the performance of the raw similarity measures βu and βv individually, based on different settings for the MT windows

  • We include the combined measures from Eqs. (18)–(20) in the analysis and investigate which of the similarity measures βmean, βmax, βmin, βu, βv performs best when features are extracted from the ambiguity spectrum (AS) and the AS is computed with MT8(13, 0.88)
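As a rough illustration of how the combined measures relate to the raw ones, the sketch below assumes that βmean, βmax and βmin are simply the mean, maximum and minimum of βu and βv. Eqs. (18)–(20) are not reproduced on this page, so this reading of the names is an assumption, and the function `combined_measures` is hypothetical.

```python
def combined_measures(beta_u, beta_v):
    # Assumed forms of the combined measures (not taken from the paper's equations):
    # the mean, maximum and minimum of the two raw similarity values.
    return {
        "mean": 0.5 * (beta_u + beta_v),
        "max": max(beta_u, beta_v),
        "min": min(beta_u, beta_v),
    }

# Example with made-up raw similarities for one syllable pair.
example = combined_measures(0.9, 0.5)
```

Under this reading, βmin is high only when both the frequency and the time vectors agree, whereas βmax is already high when one of them agrees.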


Summary

Introduction

Bird song analysis has been a large field for several decades, and for many years, methods based on spectrograms (sonograms) have been considered well suited for the comparison of bird sounds. In contrast to time-frequency (TF) distributions that aim at optimal resolution of signal components and cross-term suppression [18], multitaper (MT) spectrograms are more suitable for the type of data considered in this paper, as MTs are expected to smooth out small differences in time and frequency locations and lower the in-class variance. The main contribution of this paper is the introduction of a feature set based on the SVD of the ambiguity spectrum (AS), on the basis of which, e.g., classification and clustering tasks for non-stationary signals can be performed. The latter may be conducted in terms of a similarity measure. As there is no actual time and frequency location where the two spectrogram images coincide to a sufficiently large extent, a syllable comparison based on spectrogram cross-correlation (SPCC) will not clearly reveal the striking structural similarities between these signals. The maximum cross-correlation based on the MT spectrograms is 0.815, not strongly suggesting the substantial

Feature extraction—singular value decomposition
Evaluation of combined measures
Comparison with established approaches
Conclusions