To respond appropriately to an utterance, a human-like communication system should consider not only the words in the utterance but also the speaker's emotion. We thus propose a natural language dialog system that estimates the user's emotion from utterances and responds on the basis of the estimated emotion. To estimate the speaker's emotion (positive, negative, or neutral), a Support Vector Machine (SVM) classifies 384 acoustic features extracted from each utterance. Artificial Intelligence Markup Language (AIML)-based response generation rules are extended so that the speaker's emotion can be used as a condition in these rules. Two experiments were carried out to compare impressions of a dialog agent that considered emotion (proposed system) with those of an agent that did not (previous system). In the first experiment, 10 subjects rated their impressions after watching four conversation videos (no emotion estimation, correct emotion estimation, inadequate emotion estimation, and imperfect emotion estimation). In the second experiment, another 10 subjects rated their impressions after talking with both dialog agents. These experimental results and a demonstration of the proposed system will be shown in the presentation.
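The emotion-estimation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a scikit-learn SVM over 384-dimensional acoustic feature vectors, with synthetic stand-in data and an assumed label scheme (negative/neutral/positive).

```python
# Minimal sketch (assumptions, not the authors' code): an SVM that maps
# a 384-dimensional acoustic feature vector for one utterance to an
# emotion class, as the abstract describes.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in training data: 60 utterances, 384 acoustic
# features each; labels 0=negative, 1=neutral, 2=positive (assumed).
X_train = rng.normal(size=(60, 384))
y_train = rng.integers(0, 3, size=60)

clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

EMOTIONS = {0: "negative", 1: "neutral", 2: "positive"}

def estimate_emotion(features: np.ndarray) -> str:
    """Classify one 384-dim acoustic feature vector into an emotion label."""
    return EMOTIONS[int(clf.predict(features.reshape(1, -1))[0])]

print(estimate_emotion(rng.normal(size=384)))
```

The predicted label could then be passed to the response generator as the condition tested by the extended AIML rules.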