We present an analysis of the rate of sign changes in the discrete Fourier spectrum of a sequence. The sign changes of either the real or imaginary parts of the spectrum are considered, and the rate of sign changes is termed as the spectral zero-crossing rate (SZCR). We show that SZCR carries information pertaining to the locations of transients within the temporal observation window. We show duality with temporal zero-crossing rate analysis by expressing the spectrum of a signal as a sum of sinusoids with random phases. This extension leads to spectral-domain iterative filtering approaches to stabilize the spectral zero-crossing rate and to improve upon the location estimates. The localization properties are compared with group-delay-based localization metrics in a stylized signal setting well-known in speech processing literature. We show applications to epoch estimation in voiced speech signals using the SZCR on the integrated linear prediction residue. The performance of the SZCR-based epoch localization technique is competitive with the state-of-the-art epoch estimation techniques that are based on average pitch period.
Read full abstract