Abstract

This paper presents a method that uses a web questionnaire to create a corpus of spontaneous utterances that express ideas naturally and may contain grammatical mistakes. In an experimental implementation of the method, the subjects were informed that they were receiving nursing care from a person, and they answered a web-based questionnaire in which their responses were recorded as speech utterances. Compared with the Wizard of Oz approach and interview-based corpus-creation methods, the presented method simplifies the collection of utterances. We conducted a two-part assessment to verify the effectiveness of the method. First, the method significantly reduced the workload compared with interview-style utterance collection. Second, we compared the variety of expressions collected when subjects were informed that they were talking to a person with the variety collected when they were informed that they were communicating with a nursing robot. The results indicate that, although the number of utterances was larger for a robot than for a person, the other metrics (time efficiency index, total number of morphemes, average number of morphemes per utterance, number of unique morphemes, and coefficient of variation) were larger when the speech target was a human than when it was a robot.

