Abstract
This paper proposes a novel framework for generating action descriptions from human whole-body motions and the objects to be manipulated. The generation is based on three modules: the first module categorizes human motions and objects; the second module associates the motion and object categories with words; and the third module extracts a sentence structure as word sequences. Human motions and objects to be manipulated are classified into categories by the first module; words highly relevant to the motion and object categories are then generated by the second module; and finally the words are converted into sentences, in the form of word sequences, by the third module. The motions and objects, along with the relations among the motions, objects, and words, are parametrized stochastically by the first and second modules. The sentence structures are parametrized from a dataset of word sequences as a dynamical system by the third module. Linking the stochastic representation of the motions, objects, and words with the dynamical representation of the sentences allows sentences describing human actions to be synthesized. We tested the proposed method by synthesizing action descriptions for a human action dataset captured by an RGB-D sensor, and demonstrated its validity.
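The first module's stochastic categorization can be illustrated with a toy sketch: train (or fix) one hidden Markov model per motion category over quantized motion features, then label a new motion by whichever model assigns it the highest likelihood. The parameters, category names, and discrete observation symbols below are purely hypothetical placeholders, not the paper's actual models.

```python
import math

def _logsumexp(xs):
    # Numerically stable log(sum(exp(x))) for combining log-probabilities.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM
    (initial distribution pi, transitions A, emissions B), computed
    with the standard forward algorithm in log space."""
    n = len(pi)
    alpha = [math.log(pi[i]) + math.log(B[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            math.log(B[j][o])
            + _logsumexp([alpha[i] + math.log(A[i][j]) for i in range(n)])
            for j in range(n)
        ]
    return _logsumexp(alpha)

def classify(obs, models):
    """Assign the motion category whose HMM gives the highest likelihood."""
    return max(models, key=lambda c: forward_log_likelihood(obs, *models[c]))

# Two hypothetical categories over a binary feature alphabet {0, 1}:
# "reach" tends to emit symbol 0, "wave" tends to emit symbol 1.
models = {
    "reach": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "wave":  ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}
print(classify([0, 0, 1, 0], models))  # mostly 0-symbols -> "reach"
```

In the full framework the resulting category label would feed the second module, which maps motion and object categories to likely words; here the maximum-likelihood decision simply stands in for that stochastic association.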
Highlights
The demographic trend in advanced countries is that the percentage of elderly people is increasing, even as the total population is shrinking
While research has focused on increasing the integration density and accuracy of hardware technology, other elements are essential to constructing intelligent humanoid robots: software for obtaining external information corresponding to the five human senses, perceiving by using the obtained information, and controlling the motion of the robot
This paper proposes a link between human whole-body motions, manipulation target objects, and language for synthesizing sentences describing human actions
Summary
The demographic trend in advanced countries is that the percentage of elderly people is increasing, even as the total population is shrinking. Takano and Nakamura (2015a, b) proposed a model that combines motion symbols characterized by HMMs with natural language, and developed a computational method for creating sentences that represent motions. These motion recognition systems use only bodily motion information, such as the three-dimensional position of each body part or time-series data of joint angles, and it is anticipated that they will be extended to handle the environment, both (a) for understanding actions in which meaning is imparted to human motion by interactions with the environment, and (b) for generating actions such as manipulation of objects in the environment. This is done with the aim of more correctly understanding human actions by using multimodal information, comprising body motion information such as the three-dimensional position of each body part and time-series joint-angle data, together with the positions and types of objects in the environment, linked with descriptive sentences representing the action