Abstract
We demonstrate an intelligent conversational agent system designed for advancing human-machine collaborative tasks. The agent interprets a user's communicative intent from both verbal utterances and non-verbal behaviors, such as gestures. It can likewise communicate through both natural language and gestures via its embodiment as an avatar, facilitating natural, symmetric multi-modal interaction. As use cases of our system, we demonstrate two intelligent agents with specialized skills in the Blocks World.
Highlights
Recent advances in speech recognition and natural language processing have resulted in the increasing use of intelligent assistants, such as Google Assistant, Siri, and Alexa, in our daily lives, replacing keyboard or touch interfaces.
We demonstrate a system for symmetric natural communication in which a computer interacts with its users through both verbal and non-verbal communication, enabling more robust conversation.
We demonstrate two use cases in the Blocks World domain.
Summary
Recent advances in speech recognition and natural language processing have resulted in the increasing use of intelligent assistants, such as Google Assistant, Siri, and Alexa, in our daily lives, replacing keyboard or touch interfaces. To facilitate the communication of a machine's complex ideas to the human, the machine's utterances need to be embellished with appropriate non-verbal behaviors. Our platform acts as the eyes and ears of the AI agent, tracking the blocks on the table (Son et al., 2016) and the multi-modal behaviors, both verbal and non-verbal, of the human interacting with it (Siddique et al., 2015). It provides an embodiment of the machine in the form of a simple humanoid avatar for users to interact with. The system is publicly available for use by the research community (Salter et al., 2017).
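As an illustration of how verbal and non-verbal channels might be combined, the sketch below resolves a deictic utterance ("move that block") using a pointing gesture. All names and data structures here are hypothetical and are not the actual API of our platform; it shows only the general idea of multi-modal intent fusion in a Blocks World setting.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical structures for the two input channels; not the platform's real API.

@dataclass
class Gesture:
    kind: str                      # e.g. "point"
    target: Optional[str] = None   # block id the pointing ray intersects

@dataclass
class Utterance:
    text: str

def interpret_intent(utterance: Utterance, gesture: Optional[Gesture]) -> dict:
    """Fuse the verbal and gestural channels into a single intent."""
    tokens = utterance.text.lower().split()
    intent = {"action": None, "object": None}
    if "put" in tokens or "move" in tokens:
        intent["action"] = "move"
    # Deictic words ("that", "this") are resolved from the gesture channel.
    if any(w in tokens for w in ("that", "this")) and gesture and gesture.kind == "point":
        intent["object"] = gesture.target
    return intent

# Usage: "move that block" plus pointing at block B3 resolves to moving B3.
print(interpret_intent(Utterance("move that block"), Gesture("point", "B3")))
```

The key design point is that neither channel alone determines the intent: the utterance supplies the action, while the gesture disambiguates the referent.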