Abstract

Object detection and classification have countless applications in human–robot interaction systems. They are necessary skills for autonomous robots that perform tasks in household scenarios. Despite the great advances in deep learning and computer vision, social robots performing non-trivial tasks usually spend most of their time finding and modeling objects. Working in real scenarios means dealing with constant environment changes and relatively low-quality sensor data due to the distance at which objects are often found. Ambient intelligence systems equipped with different sensors can also benefit from the ability to find objects, enabling them to inform humans about their location. For these applications to succeed, systems need to detect the objects that may potentially contain other objects while working with relatively low-resolution sensor data. A Passive Learning Sensor Architecture (PLSA) has been designed to take advantage of multimodal information, obtained using an RGB-D camera and trained semantic language models. The main contribution of the architecture lies in improving the performance of the sensor under conditions of low resolution and high light variation by combining image labeling with word semantics. Tests performed on each stage of the architecture compare this solution with current labeling techniques in the context of an autonomous social robot working in an apartment. The results obtained demonstrate that the proposed sensor architecture outperforms state-of-the-art approaches.

Highlights

  • Autonomous social robots and ambient intelligence systems can be of great help in our daily activities

  • The main contribution of the architecture lies in improving the performance of the sensor under conditions of low resolution and high light variation by combining image labeling with word semantics

  • A Semantic Vector (SV) is computed for each location as the average of the word vectors of its labels. These average vectors consider all labels in a certain location as a whole, minimizing the effect of false positives introduced by the Convolutional Neural Network (CNN) step in the final object search (see the sketch after this list)
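The following is a minimal sketch of the per-location Semantic Vector described in the last highlight. It assumes word vectors from a model such as word2vec; the function names, the embed lookup, and the vector size are illustrative assumptions, not the published implementation.

import numpy as np

EMBEDDING_DIM = 300  # assumed word-vector size (e.g., word2vec/GloVe-style embeddings)

def semantic_vector(labels, embed):
    """Average the word vectors of all CNN labels observed at one location.

    `labels` is a list of label strings from the CNN classification step and
    `embed` maps a label to a NumPy word vector (or None if out of vocabulary).
    Averaging over the whole location dilutes isolated false positives.
    """
    vectors = []
    for label in labels:
        vec = embed(label)
        if vec is not None:
            vectors.append(vec)
    if not vectors:
        return np.zeros(EMBEDDING_DIM)
    return np.mean(vectors, axis=0)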

Summary

Introduction

Autonomous social robots and ambient intelligence systems can be of great help in our daily activities. The main contribution of the Passive Learning Sensor Architecture (PLSA) is that it improves the performance of social robots and autonomous systems in the task of finding objects in large environments by combining image labeling with word semantics. It uses multimodal information, combining language semantic information from trained models with visual input data, to estimate the most likely location of any given object. Taking advantage of this multimodality, the PLSA combines labeled images with language semantic information to make early predictions about possible object locations, as sketched below. This way, the social robot can optimize its search path when looking for objects. For an easier understanding of the abbreviations used throughout this work, a reference list is included at the end of the manuscript.
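Below is a minimal sketch of how per-location Semantic Vectors could be used to rank candidate locations for a target object. The location names, the embed lookup, and the function names are assumptions for illustration; the averaging step is inlined so the example is self-contained.

import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 when either has zero norm."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom else 0.0

def rank_locations(target_object, location_labels, embed):
    """Order candidate locations by semantic similarity to the target object.

    `location_labels` maps a location name (e.g., "kitchen") to the list of
    CNN labels observed there. Each location is summarized by the mean of its
    label word vectors and compared against the target object's vector.
    """
    target_vec = embed(target_object)
    scores = {}
    for location, labels in location_labels.items():
        if not labels:
            continue
        location_vec = np.mean([embed(label) for label in labels], axis=0)
        scores[location] = cosine(target_vec, location_vec)
    # Most promising locations first, so they are visited earlier in the search.
    return sorted(scores, key=scores.get, reverse=True)

Under these assumptions, the robot would visit the highest-scoring location first and fall back to the next candidates only if the object is not found, shortening the expected search path.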

State-of-the-Art
Passive Learning Sensor Architecture
Cognitive Attention
Cognitive Subtraction
CNN Classification Step
Semantic Processing
Experiments
Tests on Image Buffering
Cognitive Attention Tests
Tests with Networks with Generic ImageNet Training
Tests with Networks with Fine-Tuned Training Datasets
Findings
Conclusions and Future Work