Abstract

This paper describes a method for automatically selecting the type of response a conversational dialog system should give, such as a back-channel response, a topic change, or a topic expansion, using acoustic features extracted from user utterances. These features include spectral information represented by MFCCs and LSPs, pitch information represented by F0, and loudness, among others. We constructed a corpus of dialogues between elderly people and an interviewer, and evaluation experiments showed that our method achieved an F-measure of 49.3% in a speech segment identification task. Adding the delta coefficients of each feature improved performance further.
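As an illustration of the delta coefficients mentioned above, the following is a minimal sketch of one common way to compute them: a regression over neighboring frames. The abstract does not say which delta formulation or window width was used, so the function name `delta`, the `width` parameter, and the edge-padding behavior are all assumptions made for this example.

```python
def delta(frames, width=2):
    """Regression-based delta coefficients (assumed formulation, not taken from the paper).

    frames: list of per-frame feature vectors (lists of floats), e.g. MFCCs.
    width:  number of frames on each side used in the regression (hypothetical default).
    Frames beyond the sequence boundaries are replaced by the first or last frame.
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    dims = len(frames[0])
    out = []
    for t in range(len(frames)):
        row = []
        for d in range(dims):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][d]   # clamp at the end
                behind = frames[max(t - n, 0)][d]     # clamp at the start
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def append_deltas(frames, width=2):
    """Concatenate each static feature vector with its delta vector."""
    return [f + d for f, d in zip(frames, delta(frames, width))]
```

For a feature that rises by one unit per frame, the delta in the interior of the sequence is 1.0, and for a constant feature it is 0.0, so the delta captures how quickly a feature such as F0 or loudness is changing within the utterance.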
