Abstract
This study investigates autocorrelation-based features as a potential basis for phonetic and syllabic distinctions. The work comes out of a theory of auditory signal processing based on central monaural autocorrelation and binaural crosscorrelation representations. Correlation-based features are used to predict monaural and binaural perceptual attributes that are important for the architectural acoustic design of concert halls: pitch, timbre, loudness, duration, reverberation-related coloration, sound direction, apparent source width, and envelopment (Ando, 1985, 1998; Ando and Cariani, 2009). The current study investigates the use of features of monaural autocorrelation functions (ACFs) for representing phonetic elements (vowels), syllables (CV pairs), and phrases using a small set of temporal factors extracted from the short-term running ACF. These factors include listening level (loudness), zero-lag ACF peak width (spectral tilt), τ1 (voice pitch period), φ1 (voice pitch strength), τe (effective duration of the ACF envelope, temporal repetitive continuity/contrast), segment duration, and Δφ1/Δt (the rate of pitch strength change, related to voice pitch attack-decay dynamics). Times at which ACF effective duration τe is minimal reflect rapid signal pattern changes that usefully demarcate segmental boundaries. Results suggest that vowels, CV syllables, and phrases can be distinguished on the basis of this ACF-derived feature set.
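As a rough illustration of the kind of feature extraction described above, the following is a minimal sketch that computes several of the listed factors from a short-term running ACF. It is not the authors' implementation. The function names (`normalized_acf`, `effective_duration`, `acf_factors`, `running_factors`), the 40 ms frame and 10 ms hop, the 60 to 500 Hz pitch search range, the 0.5 crossing used for the zero-lag peak width, and the straight-line fit over ACF peaks above -5 dB extrapolated to -10 dB for τe are all assumptions chosen for the example.

```python
import numpy as np

def normalized_acf(frame):
    """Return (normalized ACF for lags 0..N-1, mean power) of one frame."""
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    n = len(x)
    spec = np.fft.rfft(x, 2 * n)               # zero-pad to avoid circular wrap-around
    acf = np.fft.irfft(np.abs(spec) ** 2)[:n]  # biased (tapered) autocorrelation
    energy = acf[0]
    phi = acf / energy if energy > 0 else acf
    return phi, energy / n

def effective_duration(phi, fs, floor_db=-5.0):
    """Assumed tau_e estimate: fit a line to the ACF peak levels (in dB) above
    floor_db and return the delay at which that line reaches -10 dB."""
    interior = (phi[1:-1] > phi[:-2]) & (phi[1:-1] >= phi[2:])
    peaks = np.nonzero(interior)[0] + 1
    peaks = peaks[phi[peaks] >= 10 ** (floor_db / 10)]
    lags = np.concatenate(([0], peaks))        # include the zero-lag point (0 dB)
    if lags.size < 2:
        return np.nan
    level_db = 10 * np.log10(phi[lags])
    slope, intercept = np.polyfit(lags / fs, level_db, 1)
    if slope >= 0:
        return np.inf                          # envelope does not decay in this frame
    return (-10.0 - intercept) / slope

def acf_factors(frame, fs, f0_min=60.0, f0_max=500.0):
    """Extract per-frame factors; the search range and thresholds are assumptions."""
    phi, power = normalized_acf(frame)
    # tau1 / phi1: delay and height of the largest ACF peak in the pitch range
    lo = int(fs / f0_max)
    hi = min(int(fs / f0_min), len(phi) - 1)
    k = lo + int(np.argmax(phi[lo:hi + 1]))
    # zero-lag peak width: first delay at which the ACF drops below 0.5
    below = np.nonzero(phi < 0.5)[0]
    return {
        "level_db": 10 * np.log10(power + 1e-12),  # relative listening level
        "w0": below[0] / fs if below.size else np.nan,
        "tau1": k / fs,
        "phi1": float(phi[k]),
        "tau_e": effective_duration(phi, fs),
    }

def running_factors(signal, fs, frame_s=0.04, hop_s=0.01):
    """Running analysis over overlapping frames (signal must span >= 2 frames)."""
    n, hop = int(frame_s * fs), int(hop_s * fs)
    rows = [acf_factors(signal[s:s + n], fs)
            for s in range(0, len(signal) - n + 1, hop)]
    out = {key: np.array([r[key] for r in rows]) for key in rows[0]}
    out["dphi1_dt"] = np.gradient(out["phi1"], hop_s)  # rate of pitch strength change
    return out

if __name__ == "__main__":
    fs = 16000
    t = np.arange(int(0.2 * fs)) / fs
    # Synthetic 100 Hz harmonic tone standing in for a voiced segment
    x = sum(a * np.sin(2 * np.pi * 100 * h * t)
            for h, a in ((1, 1.0), (2, 0.5), (3, 0.25)))
    print(acf_factors(x[:640], fs))
```

Under these assumptions, candidate segment boundaries could then be taken as local minima of the `tau_e` track returned by `running_factors`, following the observation that minima of τe mark rapid changes in the signal pattern.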