Pitch segmentation of speech signals based on short-time energy waveform

Sopon Wiriyarattanakul, Nawapak Eua-Anant

doi:10.1007/s10772-017-9459-4

Abstract

In general, speech consists of quasi-repetitive patterns called pitches, which represent the fundamental period and tonal information of the voice. Extraction of pitch information, which is crucial for many speech processing techniques, is usually hampered by noise and by interference from high-order harmonic components. This paper introduces a novel, noise-robust method for determining the speech fundamental frequency and performing pitch segmentation, based on a short-time energy waveform (SEW), defined as the moving average of the squared signal. When a moving average filter with a window size close to the fundamental period is applied, nearly repetitive patterns with fewer ripples, synchronized with the actual pitches, can clearly be observed in the SEW. The DC component of the SEW is removed using morphological top-hat and bottom-hat transforms. The fundamental frequency is determined as the frequency corresponding to the largest peak of the power spectrum of the DC-removed SEW. Finally, a time-domain window search is performed to locate the local extrema associated with pitches. Compared to traditional pitch detection techniques, the proposed technique yields pitch segmentation results with a higher rate of accuracy and greater noise robustness.
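
The processing chain described in the abstract can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation. The abstract does not say how the two morphological transforms are combined, which search band is used, or which extrema are tracked, so the sketch assumes a flat structuring element with the DC-removed signal taken as top-hat minus bottom-hat, a 50 to 500 Hz search band, and local maxima as pitch marks. All function names, default values, and the synthetic test signal are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def short_time_energy(x, win):
    """SEW: moving average of the squared signal, same length as x."""
    return np.convolve(x ** 2, np.ones(win) / win, mode="same")


def _sliding(x, win, reducer):
    """Sliding min or max over an odd-length window, with edge padding."""
    half = win // 2
    padded = np.pad(x, (half, half), mode="edge")
    return reducer(sliding_window_view(padded, win), axis=1)


def remove_dc(sew, win):
    """Assumed combination: top-hat minus bottom-hat, flat structuring element."""
    if win % 2 == 0:
        win += 1  # keep the structuring element symmetric
    opening = _sliding(_sliding(sew, win, np.min), win, np.max)
    closing = _sliding(_sliding(sew, win, np.max), win, np.min)
    top_hat = sew - opening
    bottom_hat = closing - sew
    return top_hat - bottom_hat


def fundamental_frequency(sew_ac, fs, fmin=50.0, fmax=500.0):
    """Frequency of the largest power-spectrum peak inside an assumed band."""
    windowed = sew_ac * np.hanning(len(sew_ac))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(sew_ac), d=1.0 / fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    return float(freqs[band][np.argmax(power[band])])


def pitch_marks(sew_ac, fs, f0, tol=0.3):
    """Window search: one local maximum per period, near the expected spot."""
    period = int(round(fs / f0))
    half = max(1, int(tol * period))
    marks = [int(np.argmax(sew_ac[:period]))]
    while marks[-1] + period + half < len(sew_ac):
        lo = marks[-1] + period - half
        hi = marks[-1] + period + half + 1
        marks.append(lo + int(np.argmax(sew_ac[lo:hi])))
    return np.array(marks)


def synthetic_voiced(fs=8000, f0=100.0, duration=0.5, formant=700.0):
    """Hypothetical test signal: impulse train exciting one damped resonance."""
    n = int(fs * duration)
    period = int(round(fs / f0))
    pulses = np.zeros(n)
    pulses[::period] = 1.0
    k = np.arange(3 * period)
    h = np.exp(-k / (period / 4)) * np.sin(2 * np.pi * formant * k / fs)
    return np.convolve(pulses, h)[:n]


if __name__ == "__main__":
    fs = 8000
    x = synthetic_voiced(fs=fs, f0=100.0)
    # Averaging window somewhat shorter than the 80-sample period.
    sew = short_time_energy(x, win=61)
    sew_ac = remove_dc(sew, win=161)
    f0 = fundamental_frequency(sew_ac, fs)
    marks = pitch_marks(sew_ac, fs, f0)
    print(f0, np.median(np.diff(marks)))
```

One design point worth noting: a moving average whose length equals the period of a perfectly periodic signal flattens the SEW to a constant, so the sketch uses an averaging window that is near the period but not equal to it. The structuring element, in contrast, is chosen longer than the longest expected period so that the opening and closing follow the lower and upper envelopes of the SEW.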

Journal: International Journal of Speech Technology
Publication Date: Sep 19, 2017
Citations: 2
