Abstract
Cochlear implant (CI) listeners have been found to have great difficulty recognizing vocal emotion because CI devices provide limited spectral cues. Previous studies have shown that the modulation spectral features of temporal envelopes may be important cues for vocal emotion recognition in noise-vocoded speech (NVS), which is used to simulate CIs. This paper examines the feasibility of vocal emotion conversion on the modulation spectrogram so that vocal emotion can be correctly recognized with simulated CIs. A method based on a linear prediction scheme is proposed to modify the modulation spectrogram of neutral speech, and its features, to match those of emotional speech. The rationale is that if vocal emotion perception of NVS is based on modulation spectral features, then NVS whose modulation spectral features resemble those of emotional speech will be recognized as conveying the same emotion. The modulation spectrogram of neutral speech was successfully converted to that of emotional speech, and the results of the evaluation experiment showed the feasibility of vocal emotion conversion on the modulation spectrogram for simulated CIs. Vocal emotion enhancement on the modulation spectrogram is also discussed.