Abstract

In audio classification, features extracted from the frequency-domain representation of a signal typically focus on the magnitude spectrum, while the phase spectrum is ignored. The conventional Fourier Phase Spectrum is a highly discontinuous function, which makes it unsuitable for feature extraction in classification applications, where function continuity is required. In this work, the sources of phase spectral discontinuities are detected, categorized and compensated, resulting in a phase spectrum with significantly reduced discontinuities. The Hartley Phase Spectrum, introduced as an alternative to the conventional Fourier Phase Spectrum, encapsulates the phase content of the signal more efficiently than its Fourier counterpart because, among other properties, it does not suffer from the phase ‘wrapping ambiguities’ introduced by the inverse tangent function used to compute the Fourier Phase Spectrum. In the proposed feature extraction method, statistical features extracted from the Hartley Phase Spectrum are combined with statistical features extracted from the magnitude-related spectrum of the signals. The experimental results show that the classification score is higher when magnitude- and phase-related features are combined than when only magnitude features are used.
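The wrapping issue mentioned in the abstract can be illustrated with a minimal sketch. It assumes one possible definition of a Hartley-type phase function, namely the discrete Hartley transform divided by the Fourier magnitude. The text above does not state the definition used in the paper, so this is an assumption. The function names `fourier_phase` and `hartley_phase`, the `eps` guard and the delayed-impulse test signal are all hypothetical choices made for illustration. The sketch addresses only the wrapping discontinuities, not the other discontinuity sources the abstract refers to.

```python
import numpy as np

def fourier_phase(x):
    """Wrapped Fourier phase in (-pi, pi], computed via the inverse tangent."""
    return np.angle(np.fft.fft(x))

def hartley_phase(x, eps=1e-12):
    """Hartley-type phase function (assumed definition, not taken from the text).

    With X = FFT(x) using the e^{-j} kernel, the discrete Hartley transform is
    H = Re(X) - Im(X). Dividing by |X| gives cos(phi) - sin(phi), which is
    bounded by sqrt(2) and needs no inverse tangent.
    """
    X = np.fft.fft(x)
    H = X.real - X.imag
    # eps guards against division by zero at spectral nulls (illustrative choice).
    return H / np.maximum(np.abs(X), eps)

# A delayed unit impulse has a linear phase of -2*pi*k*d/N before wrapping.
N, d = 64, 5
x = np.zeros(N)
x[d] = 1.0

fps = fourier_phase(x)
hps = hartley_phase(x)

# Largest bin-to-bin change in each representation.
max_jump_fps = np.max(np.abs(np.diff(fps)))
max_jump_hps = np.max(np.abs(np.diff(hps)))
print(f"largest jump, wrapped Fourier phase: {max_jump_fps:.3f} rad")
print(f"largest jump, Hartley-type phase:    {max_jump_hps:.3f}")
```

For this signal the wrapped Fourier phase shows jumps close to 2π wherever the linear phase crosses ±π, whereas the Hartley-type function varies smoothly from bin to bin because it is built from the cosine and sine of the phase rather than from the phase angle itself.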

Highlights

  • These results indicate that the phase spectral content is presented to the classifier in an improved manner using the Hartley Phase Spectrum (HPS) as compared with the Fourier Phase Spectrum (FPS)

Summary

Introduction

The spectral magnitude reveals how the energy of a signal is distributed across frequencies. It does not, however, convey where those magnitude components are located in the time domain. That information, together with the signal dynamics, is encapsulated in the phase spectrum [1,2]. The phase spectrum is used in various speech processing applications, such as formant extraction, pitch extraction, speech intelligibility, speech enhancement, iterative signal reconstruction and automatic speech recognition [1,2,6]. The review in [1] concludes that, for automatic speech recognition, the (processed) phase spectrum encapsulates class-discriminative information not conveyed by the signal magnitude (e.g. the Mel-Frequency Cepstral Coefficients (MFCCs)). The experimental results of the present work, on the classification of audio signals, agree with both of the aforementioned claims.
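The claim that time-location information resides in the phase rather than the magnitude can be checked with a short sketch based on the DFT shift property. The random test signal, its length and the shift of 17 samples are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(128)

# Circularly delay the signal by 17 samples: same content, different time location.
shift = 17
x_shifted = np.roll(x, shift)

X = np.fft.fft(x)
Xs = np.fft.fft(x_shifted)

# The magnitude spectrum is unchanged by the delay.
same_magnitude = np.allclose(np.abs(X), np.abs(Xs))

# The delay appears only as a linear phase term exp(-j*2*pi*k*shift/N).
k = np.arange(x.size)
expected = X * np.exp(-2j * np.pi * k * shift / x.size)
shift_in_phase = np.allclose(Xs, expected)

print("magnitude unchanged by delay:", same_magnitude)
print("delay fully explained by phase term:", shift_in_phase)
```

A classifier that sees only the magnitude spectrum therefore cannot distinguish the original signal from its delayed copy, which is the kind of information the phase-related features are intended to supply.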
