Abstract

We propose and evaluate a probabilistic model that generates nod motions from utterance categories estimated from speech input. The model comprises two main blocks. The first block estimates dialog-act-related categories from the input speech. Based on the correlations between dialog acts and head motions, utterances are classified into three categories with distinct nod distributions. Linguistic information extracted from the input speech is fed to a set of classifiers, whose outputs are combined to estimate the utterance category. The second block generates nod motion parameters based on the category estimated by the classifiers. The nod motion parameters are represented as probability distribution functions (PDFs) inferred from human motion data. Using speech energy features, the parameters are sampled from the PDFs belonging to the estimated category. The effectiveness of the proposed model was evaluated on an android robot through subjective experiments. Experimental results indicate that the motions generated by the proposed approach are judged more natural than those of a previous model that uses fixed nod shapes and hand-labeled utterance categories.
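
The two-block structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the category names, the classifier combination rule, the form of the PDFs, or how speech energy enters the sampling, so the following are all assumptions made only for illustration: the placeholder labels `A`, `B`, `C`, score averaging across classifiers, Gaussian amplitude distributions with invented parameters, a per-category nod probability, and the use of normalized speech energy as the quantile at which the amplitude distribution is read.

```python
from statistics import NormalDist

# Placeholder labels: the abstract does not name the three utterance categories.
CATEGORIES = ("A", "B", "C")

# Hypothetical per-category nod model. All numbers are invented for illustration;
# in the paper the distributions are inferred from human motion data.
NOD_MODEL = {
    "A": {"p_nod": 0.8, "amplitude": NormalDist(mu=8.0, sigma=2.0)},
    "B": {"p_nod": 0.4, "amplitude": NormalDist(mu=5.0, sigma=1.5)},
    "C": {"p_nod": 0.1, "amplitude": NormalDist(mu=3.0, sigma=1.0)},
}


def combine_classifiers(score_lists):
    """Block 1 (assumed rule): average per-category scores over all
    classifiers and return the category with the highest mean score."""
    n = len(score_lists)
    mean = [sum(scores[i] for scores in score_lists) / n
            for i in range(len(CATEGORIES))]
    return CATEGORIES[mean.index(max(mean))]


def generate_nod(category, energy, rng):
    """Block 2 (assumed rule): decide whether to nod, then pick an amplitude.

    energy is a speech energy feature normalized to [0, 1] (assumption).
    rng is any object with a random() method, e.g. random.Random().
    Returns the nod amplitude in degrees, or None when no nod is generated.
    """
    model = NOD_MODEL[category]
    if rng.random() >= model["p_nod"]:
        return None
    # Clamp to the open interval (0, 1) required by inv_cdf.
    quantile = min(max(energy, 0.01), 0.99)
    return model["amplitude"].inv_cdf(quantile)
```

For example, `combine_classifiers([[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]])` selects category `A`, and `generate_nod("A", energy, random.Random())` then either returns `None` or an amplitude that grows with `energy`. Mapping energy to a quantile is only one way to make louder speech produce larger nods; the paper may couple the two differently.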

