Selection of a voice for a speech signal for personalized warnings: the effect of speaker’s gender and voice pitch

Sheron Machado, Lara Reis, Júlia Teles, Emília Duarte, Francisco Rebelo

doi:10.3233/wor-2012-0670-3592

Open Access

Abstract

There is increasing interest in multimodal technology-based warnings, namely those that convey speech warning statements. This type of warning can be tailored to the situation as well as to the target user's characteristics. However, more information is needed on how to design these warnings so that they ensure intelligibility, promote compliance and reduce the potential for annoyance. In this context, this paper reports an exploratory study whose main purpose was to support the selection of a synthesized voice for a subsequent compliance study with personalized (i.e., using the person's name) technology-based warnings in Virtual Reality. Participants listened to speech signals, which were generated by a speech synthesizer and post-processed to change the perceived pitch, and evaluated them by completing the MOS-X questionnaire. The participants then ranked the voices in order of preference. The effects of the speaker's gender and voice pitch on both ratings and rankings were assessed. The preference of male and female listeners for the talker's gender was also investigated. The results show that participants most often ranked the high-pitched female voice as their first choice; this voice also obtained the highest overall score on the MOS-X questionnaire. No significant influence of the participants' gender was found on the assessed measures.

Full Text