Linguistic Instruction Research Articles

Vision-Language Navigation (VLN) is a challenging task that requires an agent to align complex visual observations with language instructions in order to reach the goal position. Most existing VLN agents directly learn to align raw directional features and visual features, trained using one-hot labels, with linguistic instruction features. However, the large semantic gap among these multi-modal inputs makes the alignment difficult and therefore limits navigation performance. In this paper, we propose Actional Atomic-Concept Learning (AACL), which maps visual observations to actional atomic concepts to facilitate the alignment. Specifically, an actional atomic concept is a natural language phrase containing an atomic action and an object, e.g., "go up stairs". These actional atomic concepts, which serve as a bridge between observations and instructions, can effectively mitigate the semantic gap and simplify the alignment. AACL contains three core components: 1) a concept mapping module that maps the observations to actional atomic concept representations through the VLN environment and the recently proposed Contrastive Language-Image Pretraining (CLIP) model, 2) a concept refining adapter that encourages more instruction-oriented object concept extraction by re-ranking the object concepts predicted by CLIP, and 3) an observation co-embedding module that uses concept representations to regularize the observation representations. AACL establishes new state-of-the-art results on both fine-grained (R2R) and high-level (REVERIE and R2R-Last) VLN benchmarks. Moreover, the visualization shows that AACL significantly improves the interpretability of action decisions. Code will be available at https://gitee.com/mindspore/models/tree/master/research/cv/VLN-AACL.
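The first two components described above (mapping an observation to concept phrases, then re-ranking those phrases toward the instruction) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 3-dimensional vectors standing in for CLIP image and text embeddings, and the weighted-sum re-ranking rule are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def map_and_rerank(obs_vec, instr_vec, concept_vecs, top_k=2, weight=0.5):
    """Hypothetical helper: keep the top_k concepts closest to the observation,
    then re-rank them by a weighted mix of observation and instruction similarity."""
    candidates = sorted(
        concept_vecs,
        key=lambda c: cosine(obs_vec, concept_vecs[c]),
        reverse=True,
    )[:top_k]

    def score(c):
        obs_sim = cosine(obs_vec, concept_vecs[c])
        instr_sim = cosine(instr_vec, concept_vecs[c])
        return (1 - weight) * obs_sim + weight * instr_sim

    return sorted(candidates, key=score, reverse=True)

# Toy embeddings (assumed values, not real encoder outputs).
concepts = {
    "go up stairs": [1.0, 0.2, 0.0],
    "walk past sofa": [0.9, 0.0, 0.5],
    "enter kitchen": [0.0, 1.0, 0.0],
}
observation = [1.0, 0.1, 0.1]
instruction = [0.6, 0.0, 0.8]

ranked = map_and_rerank(observation, instruction, concepts)
print(ranked)
```

In this toy setup the observation alone favors "go up stairs", while the instruction-aware re-ranking moves "walk past sofa" to the top, which is the kind of instruction-oriented adjustment the abstract attributes to the concept refining adapter.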

To work cooperatively with humans by using language, robots must not only acquire a mapping between language and their behavior but also autonomously use that mapping in the appropriate contexts of interactive tasks online. To this end, we propose a novel learning method that links language to robot behavior by means of a recurrent neural network. In this method, the network learns from correct examples of the imposed task that are given not as explicitly separated sets of language and behavior but as sequential data constructed from the actual temporal flow of the task. As a result, the internal dynamics of the network model both language–behavior relationships and the temporal patterns of interaction. Here, "internal dynamics" refers to the time development of the system defined on the fixed-dimensional space of the internal states of the context layer. Thus, in the execution phase, by constantly representing its position in the interaction context as its current state, the network autonomously switches between recognition and generation phases without any explicit signs and uses the acquired mapping in the appropriate contexts. To evaluate our method, we conducted an experiment in which a robot generates appropriate behavior in response to a human's linguistic instruction. After learning, the network formed an attractor structure representing both language–behavior relationships and the task's temporal pattern in its internal dynamics. In these dynamics, language–behavior mapping was achieved by a branching structure. Repetition of the human's instruction and the robot's behavioral response was represented as a cyclic structure, and waiting for a subsequent instruction was represented as a fixed-point attractor. Thanks to this structure, the robot was able to interact online with a human on the given task by autonomously switching phases.
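The notion of a fixed-point attractor in the context-layer state can be illustrated with a small sketch. It is not the network from the article: the two-unit context layer, the hand-picked weights, and the `step` function are assumptions chosen only so that the state settles when no input arrives, mimicking the "waiting" behavior described above.

```python
import math

def step(state, inp, w_rec, w_in):
    """One update of the context state: h' = tanh(W_rec h + W_in x)."""
    return [
        math.tanh(
            sum(w * h for w, h in zip(row_rec, state))
            + sum(w * x for w, x in zip(row_in, inp))
        )
        for row_rec, row_in in zip(w_rec, w_in)
    ]

# Hand-picked (not learned) weights; the recurrent matrix is contractive,
# so with zero input the state is pulled toward a single fixed point.
w_rec = [[0.5, -0.2], [0.1, 0.4]]
w_in = [[1.0], [-1.0]]

state = [0.0, 0.0]
state = step(state, [1.0], w_rec, w_in)  # a single "instruction" input
kicked = list(state)

# No further input: iterate the internal dynamics and watch the state settle.
for _ in range(200):
    prev = state
    state = step(state, [0.0], w_rec, w_in)

drift = max(abs(a - b) for a, b in zip(state, prev))
print(kicked, state, drift)
```

After the input is removed, successive states stop changing, which is what it means for the dynamics to sit at a fixed point. A trained network would additionally need branching and cyclic structures, which this sketch does not attempt to reproduce.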

Articles published on Linguistic Instruction

Implicit and explicit commonsense for multi-sentence video captioning

Natural language instructions induce compositional generalization in networks of neurons

DSG-GAN: Multi-turn text-to-image synthesis via dual semantic-stream guidance with global and local linguistics

Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation

Active exploration based on information gain by particle filter for efficient spatial concept formation

Embedding Explicit Linguistic Instruction in an SRSD Writing Intervention

Enhancing language skills through English novel instruction

Russian as a Foreign Language (RFL): Lack of agreement in noun phrases with prepositions: difficulties and approaches to solutions

Ethereum smart contracts: Analysis and statistics of their source code and opcodes

The effect of linguistic comprehension instruction on generalized language and reading comprehension skills: A systematic review.

Is Language Necessary for the Social Transmission of Lithic Technology?

ENGLISH FOR SPECIFIC PURPOSES AS A LINGUISTIC RESPONSE TO GLOBALIZATION

Paired Recurrent Autoencoders for Bidirectional Translation Between Robot Actions and Linguistic Descriptions

Critical SFL Praxis With Bilingual Youth: Disciplinary Instruction in a Third Space

Multicomponent Linguistic Awareness Intervention for At-Risk Kindergarteners

Dynamical Integration of Language and Behavior in a Recurrent Neural Network for Human-Robot Interaction.

Comparing two types of explicit pronunciation instructions on second language accentedness

Causal learning from probabilistic events in 24-month-olds: an action measure.

Teaching Mathematics through Verbal-Linguistic Intelligence

Educational implications of the deficit/deprivation hypothesis in L2 situations: a case of Zimbabwe
