Abstract

In complex surroundings, robots must recognize the same intention even when it is expressed in different ways. To help robots understand intentions more reliably, this paper proposes a self-tuning multimodal fusion algorithm that is not restricted by the expressions of interacting participants or by the environment. The algorithm can be transferred to different application platforms, and robots can acquire understanding competence and adapt to new tasks by changing the content of the robot knowledge base. In contrast to other multimodal fusion algorithms, this paper adapts the basic structure of feed-forward neural networks to discrete sets, which strengthens the consistency and improves the complementarity between modalities, and allows the self-tuning of the fusion operator and the intention search to run simultaneously. Three modalities are used: speech, gesture, and scene objects, with a single-modal classifier trained separately for each. The method was evaluated in a human-computer interaction experiment on the bionic robot Pepper platform, which showed that it effectively improves the accuracy and robustness of robot intention understanding and reduces the uncertainty of intention judgment compared with single-modal interaction.
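The abstract does not spell out the fusion operator itself, so the following is only a minimal illustrative sketch of one common way to combine separately trained single-modal classifiers over a discrete intention set: weighted late fusion whose weights are tuned from a few labelled interactions. All names here (`fuse`, `tune_weights`, the `INTENTIONS` labels) are hypothetical and stand in for the paper's actual self-tuning procedure.

```python
import numpy as np

INTENTIONS = ["bring_cup", "point_to_object", "stop"]  # hypothetical discrete intention set

def fuse(speech_p, gesture_p, scene_p, w):
    """Weighted late fusion of three per-modality distributions over INTENTIONS.

    speech_p, gesture_p, scene_p: arrays of shape (len(INTENTIONS),), each summing to 1.
    w: non-negative fusion weights of shape (3,), one per modality.
    Returns the fused distribution and the index of the most likely intention.
    """
    stacked = np.stack([speech_p, gesture_p, scene_p])  # (3, n_intentions)
    fused = w @ stacked                                  # weighted sum of distributions
    fused /= fused.sum()                                 # renormalise
    return fused, int(np.argmax(fused))

def tune_weights(examples, lr=0.1, epochs=200):
    """Tune fusion weights on labelled examples by gradient ascent on the
    log-probability of the true intention (a stand-in for the paper's
    self-tuning step, which the abstract does not describe in detail)."""
    w = np.ones(3) / 3.0
    for _ in range(epochs):
        for (speech_p, gesture_p, scene_p), true_idx in examples:
            stacked = np.stack([speech_p, gesture_p, scene_p])
            fused = w @ stacked
            # gradient of log(fused[true_idx] / fused.sum()) with respect to w
            grad = stacked[:, true_idx] / fused[true_idx] - stacked.sum(axis=1) / fused.sum()
            w = np.clip(w + lr * grad, 1e-6, None)
    return w / w.sum()

# Example: fuse one interaction's classifier outputs with fixed weights
speech  = np.array([0.6, 0.3, 0.1])
gesture = np.array([0.2, 0.7, 0.1])
scene   = np.array([0.5, 0.4, 0.1])
fused, idx = fuse(speech, gesture, scene, np.array([0.4, 0.4, 0.2]))
print(INTENTIONS[idx], fused)
```

In such a scheme the three classifiers stay independent, which matches the abstract's statement that the single-modal classifiers are trained separately; only the small set of fusion weights is adjusted when the robot moves to a new task or platform.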
