Abstract
We present KnoWDiaL, an approach for Learning and using task-relevant Knowledge from human-robot Dialog and access to the Web. KnoWDiaL assumes an autonomous agent that performs tasks requested by humans through speech. The agent needs to “understand” the request, i.e., to fully ground the task until it can plan for and execute it. KnoWDiaL contributes such understanding by using and updating a Knowledge Base, by dialoguing with the user, and by accessing the Web. We believe that KnoWDiaL, as presented, can be applied to general autonomous agents; however, we focus on our autonomous collaborative robot, CoBot, which executes service tasks in a building, moving around and transporting objects between locations. The knowledge acquired and accessed therefore consists of groundings of language to robot actions, building locations, persons, and objects. KnoWDiaL handles the interpretation of voice commands, is robust to speech recognition errors, and learns commands involving referring expressions in an open domain, i.e., without requiring a lexicon. We present in detail the components of KnoWDiaL, namely a frame-semantic parser, a probabilistic grounding model, a web-based predicate evaluator, a dialog manager, and the weighted predicate-based Knowledge Base. We illustrate knowledge access and updates from dialog and Web access through detailed, complete examples. We further evaluate the correctness of the predicate instances learned into the Knowledge Base and show the increase in dialog efficiency as a function of the number of interactions. We have extensively and successfully used KnoWDiaL on CoBot, dialoguing and accessing the Web, and we extract a few corresponding example sequences from captured videos.
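As a purely illustrative sketch, not the paper's implementation, the weighted predicate-based Knowledge Base can be thought of as a store of grounded predicate instances, each carrying a confidence weight that dialog confirmations and web evidence reinforce. The predicate name, room numbers, and weights below are hypothetical examples, not entries from the actual system:

    from collections import defaultdict

    class WeightedPredicateKB:
        """Toy store of grounded predicate instances with confidence weights."""

        def __init__(self):
            self._facts = defaultdict(float)   # (predicate, args) -> accumulated weight

        def add(self, predicate, args, weight=1.0):
            # Reinforce an instance, e.g. after a confirming dialog or web evidence.
            self._facts[(predicate, tuple(args))] += weight

        def query(self, predicate):
            # Return instances of a predicate, most strongly weighted first.
            matches = [(args, w) for (p, args), w in self._facts.items() if p == predicate]
            return sorted(matches, key=lambda m: -m[1])

    kb = WeightedPredicateKB()
    kb.add("locationName", ("dana's office", "room 7002"), weight=0.8)  # hypothetical
    kb.add("locationName", ("dana's office", "room 7004"), weight=0.2)  # hypothetical
    print(kb.query("locationName")[0])  # the currently preferred grounding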
Highlights
Speech-based interaction holds the promise of enabling robots to become both flexible and intuitive to use
We have presented KnoWDiaL, an approach for a robot to use and learn task-relevant knowledge from human-robot dialog and access to the World Wide Web
We have introduced the underlying joint probabilistic model consisting of a speech model, a parsing model, and a grounding model
Summary
Speech-based interaction holds the promise of enabling robots to become both flexible and intuitive to use. When the robot is a mobile robot servicing people, speech-based interaction has to deal with tasks involving locations and objects in the environment. A human might command a robot like CoBot to “go to Dana’s office” or to “get me a coffee”. The mobile robot must infer the type of action it should take, the corresponding location parameters, and the mentioned object. If we place no restrictions on speech, interpreting and executing a command becomes a challenging problem for several reasons; for example, the robot may not have the knowledge necessary to execute the command in its particular environment.
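To make this grounding step concrete, the following is a minimal, self-contained sketch, assuming hypothetical weighted groundings of location phrases to rooms; it is not the paper's parser or probabilistic grounding model, only an illustration of mapping an already parsed action frame to an executable task:

    # Hypothetical weighted groundings of spoken location phrases to rooms.
    LOCATION_GROUNDINGS = {
        "dana's office": [("room 7002", 0.8), ("room 7004", 0.2)],
    }

    def ground_command(frame):
        """Map a parsed action frame to a task: an action type plus grounded arguments."""
        task = {"action": frame["action"]}          # e.g. "GoTo" or "Transport"
        phrase = frame.get("location")
        if phrase in LOCATION_GROUNDINGS:
            # Pick the highest-weighted candidate; a real system would instead ask a
            # clarifying question or consult the web when no candidate is confident.
            task["location"] = max(LOCATION_GROUNDINGS[phrase], key=lambda c: c[1])[0]
        if "object" in frame:
            task["object"] = frame["object"]
        return task

    print(ground_command({"action": "GoTo", "location": "dana's office"}))
    # -> {'action': 'GoTo', 'location': 'room 7002'}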