Abstract
We address the problem of inferring a speaker's level of certainty based on prosodic information in the speech signal, which has application in speech-based dialogue systems. We show that using phrase-level prosodic features centered around the phrases causing uncertainty, in addition to utterance-level prosodic features, improves our model's level-of-certainty classification. In addition, our models can be used to predict which phrase a person is uncertain about. These results rely on a novel method for eliciting utterances of varying levels of certainty that allows us to compare the utility of contextually based feature sets. We elicit level-of-certainty ratings from both the speakers themselves and a panel of listeners, finding that there is often a mismatch between speakers' internal states and their perceived states, and highlighting the importance of this distinction.
Highlights
Speech-based technology has become a familiar part of our everyday lives.
If we enable computers to sense a speaker's level of certainty, we can improve how applications such as spoken tutorial dialogue systems [2], language learning systems [3], and voice search applications [4] interact with users.
Our basic prosody model has a lower RMS error than the nonprosodic baseline model: 0.738 compared to 1.059.
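For readers unfamiliar with the metric, the following is a minimal sketch of how a root-mean-square (RMS) error between predicted and reference certainty ratings can be computed. The function name `rms_error`, the numeric rating scale, and all of the values are illustrative assumptions; none of them are taken from the paper, and the sketch does not reproduce the reported 0.738 or 1.059.

```python
import math


def rms_error(predicted, reference):
    """Root-mean-square error between two equal-length rating sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    squared_diffs = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


# Hypothetical certainty ratings (assumed numeric scale, not data from the paper).
reference = [1.0, 2.0, 4.0, 5.0]

# Hypothetical predictions from two models.
model_a = [2.0, 2.0, 4.0, 4.0]  # tracks the reference ratings closely
model_b = [3.0, 3.0, 3.0, 3.0]  # predicts the same rating every time

print(round(rms_error(model_a, reference), 3))  # 0.707
print(round(rms_error(model_b, reference), 3))  # 1.581
```

A lower value means that the predictions lie closer, on average, to the reference ratings, which is the sense in which the prosody model is said to outperform the baseline.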
Summary
While most people can think of an instance where they have interacted with a call-center dialogue system or a command-based smartphone application, few would argue that the experience was as natural or as efficient as conversing with another human. To build computer systems that can communicate with humans using natural language, we need to know more than just the words a person is saying; we need to have an understanding of his or her internal mental state. Level of certainty is an important component of internal state. If we enable computers to sense a speaker's level of certainty, we can improve how applications such as spoken tutorial dialogue systems [2], language learning systems [3], and voice search applications [4] interact with users.