Vocal emotion recognition with cochlear implants

Xin Luo, Qian-Jie Fu, John J Galvin III

doi:10.21437/interspeech.2006-505

Abstract

Besides conveying linguistic information, spoken language can also transmit important cues regarding the emotion of a talker. These prosodic cues are most strongly coded by changes in amplitude, pitch, speech rate, voice quality and articulation. The present study investigated the ability of cochlear implant (CI) users to recognize vocal emotions, as well as the relative contributions of spectral and temporal cues to vocal emotion recognition. An English sentence database was recorded for the experiment; each test sentence was produced according to five target emotions. Vocal emotion recognition was tested in 6 CI and 6 normal-hearing (NH) subjects. With unprocessed speech, NH listeners’ mean vocal emotion recognition performance was 90% correct, while CI users’ mean performance was only 45% correct. Vocal emotion recognition was also measured in NH subjects while listening to acoustic, sine-wave vocoder CI simulations. To test the contribution of spectral cues to vocal emotion recognition, 1-, 2-, 4-, 8-, and 16-channel CI processors were simulated; to test the contribution of temporal cues, the temporal envelope filter cutoff frequency in each channel was either 50 or 500 Hz. Results showed that both spectral and temporal cues significantly contributed to performance. With the 50-Hz envelope filter, performance generally improved as the number of spectral channels was increased. With the 500-Hz envelope filter, performance significantly improved only when the spectral resolution was increased from 1 to 2, and then from 2 to 16 channels. For all but the 16-channel simulations, increasing the envelope filter cutoff frequency from 50 Hz to 500 Hz significantly improved performance. CI users’ vocal emotion recognition performance was statistically similar to that of NH subjects listening to 1 to 8 spectral channels with the 50-Hz envelope filter, and to 1 channel with the 500-Hz envelope filter. The results suggest that, while spectral cues may contribute more strongly to recognition of linguistic information, temporal cues may contribute more strongly to recognition of emotional content coded in spoken language.

Index Terms: vocal emotion recognition, cochlear implant.

