Spectrotemporal Modulations Underlying Speech and Timbre Perception

Abstract

Human speech and music are rich in spectral and temporal modulations, i.e., fluctuations of acoustic energy across frequency and over time. Speech intelligibility, melody perception, and the identification of source characteristics (e.g., vocal gender or instrumental timbre) depend on spectrotemporal modulations, yet they withstand drastic spectral and temporal degradations. We systematically explored which restricted sets of spectrotemporal modulations are essential to the perception of complex sounds. Degraded sentences were obtained with a novel modulation-filtering procedure applied to the sound spectrogram. Temporal modulation filtering smeared the amplitude envelope by removing changes above a cutoff in Hz; spectral modulation filtering smeared the energy distribution across frequency bands by removing changes above a cutoff in cyc/kHz. We complemented this low-pass filtering with more targeted notch filtering. Speech intelligibility and gender recognition were assessed. We determined that spectral modulations below ~3.75 cyc/kHz and temporal modulations between 1 and 7 Hz are essential for speech comprehension, whereas gender identification depends on the presence of higher modulations associated with the glottal pulse.

We are expanding these psychoacoustic experiments to dissimilarity judgments of orchestral instrument tones that have been normalized in level, tuning, and duration. Our goal is to relate the multiple perceptual dimensions of timbre, such as brightness, sharpness of attack, and spectral flux, to underlying acoustic differences in the spectrotemporal modulation spectrum. We will represent timbral distances in the spectrotemporal modulation domain using multidimensional scaling (MDS) [1]. Subjects will then rate the tones along 15 subjective rating scales, and we will apply principal components analysis (PCA) to these ratings to reduce their dimensionality, helping us assign qualitative labels to the timbre space obtained from the MDS. Our research could inform the design of optimal signal processing in hearing aids and cochlear implants, as well as music synthesis and transcription.

Acknowledgments

We thank D. Wessel and J.-M. Mongeau for helpful discussion.
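
To illustrate the modulation-filtering idea described above, the sketch below low-pass filters a spectrogram in the modulation domain via a 2-D Fourier transform, using the cutoffs mentioned in the abstract (7 Hz temporal, 3.75 cyc/kHz spectral). This is a minimal sketch under stated assumptions, not the authors' actual procedure: the function name, spectrogram resolution, and FFT-mask implementation are illustrative, and a notch filter would simply use a mask that removes a band of modulations rather than everything above a cutoff.

    # Minimal sketch: low-pass modulation filtering of a spectrogram (assumptions noted above).
    import numpy as np

    def lowpass_modulation_filter(S, dt, df, wt_cut=7.0, wf_cut=3.75):
        """Remove temporal modulations above wt_cut (Hz) and spectral
        modulations above wf_cut (cyc/kHz) from a spectrogram S (freq x time)."""
        M = np.fft.fft2(S)                        # 2-D modulation spectrum of the spectrogram
        wf = np.fft.fftfreq(S.shape[0], d=df)     # spectral modulation axis in cyc/kHz (df in kHz)
        wt = np.fft.fftfreq(S.shape[1], d=dt)     # temporal modulation axis in Hz (dt in s)
        keep = (np.abs(wf)[:, None] <= wf_cut) & (np.abs(wt)[None, :] <= wt_cut)
        return np.real(np.fft.ifft2(M * keep))    # low-pass-filtered spectrogram

    # Hypothetical example: 64 frequency bins of 0.125 kHz each, 10-ms time steps
    S = np.random.randn(64, 500)
    S_filtered = lowpass_modulation_filter(S, dt=0.010, df=0.125)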
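
For the planned timbre analysis, the sketch below shows how MDS on a pairwise dissimilarity matrix and PCA on subjective rating scales might be combined, here using scikit-learn. The data are random placeholders and the number of retained dimensions is an assumption for illustration, not a result from the abstract.

    # Hedged sketch: MDS timbre space plus PCA on rating scales (placeholder data).
    import numpy as np
    from sklearn.manifold import MDS
    from sklearn.decomposition import PCA

    n_tones = 12                                   # hypothetical number of instrument tones
    rng = np.random.default_rng(0)

    # Placeholder pairwise dissimilarity judgments (symmetric, zero diagonal).
    dissim = rng.random((n_tones, n_tones))
    dissim = (dissim + dissim.T) / 2
    np.fill_diagonal(dissim, 0.0)

    # MDS embeds the tones in a low-dimensional timbre space from the dissimilarities.
    mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
    timbre_space = mds.fit_transform(dissim)       # shape: (n_tones, 3)

    # PCA reduces the 15 rating scales to a few components that can label the MDS axes.
    ratings = rng.random((n_tones, 15))            # placeholder ratings on 15 scales
    pca = PCA(n_components=3)
    rating_components = pca.fit_transform(ratings) # shape: (n_tones, 3)
    print(pca.explained_variance_ratio_)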
