Abstract

In the last few years, there has been steadily growing interest in autonomous vehicles and robotic systems. Although many of these agents are expected to have limited resources, they should be able to interact dynamically with other objects in their environment. We present an approach in which lightweight sensory and processing techniques, requiring very limited memory and processing power, can be successfully applied to the task of object retrieval using sensors of different modalities. We use the Hough framework to fuse optical and orientation information from the different views of the objects. In the presented spatio-temporal perception technique, we apply active vision: based on the analysis of initial measurements, the direction of the next view is determined to increase the hit-rate of retrieval. The performance of the proposed methods is demonstrated on three datasets corrupted by heavy noise.

Highlights

  • The visual surveillance and recognition of 3D objects by unmanned aerial vehicles (UAVs) and by autonomous robots, and the replacement of bar/matrix codes with the objects’ views, are key technologies for future applications

  • There are two main topics closely related to our approach: video-based or multi-view object recognition, and active vision, both of which are often found in robotics-related journals and conferences

  • The main contribution of this article, besides the analysis of the contribution of orientation information in the Hough framework, consists in showing that the proposed technique can be used for active perception and that very lightweight techniques can be efficiently used for 3D object retrieval

Introduction

The visual surveillance and recognition of 3D objects by unmanned aerial vehicles (UAVs) and by autonomous robots, and the replacement of bar/matrix codes (used for the identification of products) with the objects’ views, are key technologies for future applications. To achieve a high hit-rate, we continuously processed the images of the video camera while the object was targeted, measured the orientation of the camera, and evaluated the possible candidates with the help of compact visual descriptors. This approach can be implemented in a dynamic recognition model, where even the orientation of new image queries can lead to faster and more accurate recognition. There has been a significant evolution in computer vision with the involvement of deep neural networks (DNNs), especially convolutional neural networks (CNNs), for object detection and recognition. These techniques can be considered appearance-based approaches and can usually reach high recognition rates in many scenarios.
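The article does not reproduce its implementation; as an illustration only, the following minimal sketch shows how Hough-style voting could fuse compact descriptor matches with camera orientation, and how an ambiguity measure over the accumulator could trigger the capture of another view (active vision). All names, the toy database, and the weighting scheme are hypothetical assumptions, not the authors' code.

```python
import math
from collections import defaultdict

# Hypothetical view database: object id -> list of (descriptor, yaw in degrees).
DB = {
    "mug": [((0.9, 0.1), 0.0), ((0.7, 0.3), 90.0)],
    "box": [((0.2, 0.8), 0.0), ((0.1, 0.9), 90.0)],
}

def similarity(d1, d2):
    """Similarity of two compact visual descriptors (inverse Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(d1, d2))

def vote(query_views):
    """Hough-style accumulation: each query view (descriptor, measured yaw)
    casts a vote for every object whose stored view agrees in both
    appearance and orientation."""
    acc = defaultdict(float)
    for q_desc, q_yaw in query_views:
        for obj, views in DB.items():
            for d_desc, d_yaw in views:
                # Orientation agreement: angular difference wrapped to [0, 180],
                # down-weighting stored views far from the measured camera yaw.
                ang = abs((q_yaw - d_yaw + 180) % 360 - 180)
                w = max(0.0, 1.0 - ang / 90.0)
                acc[obj] += w * similarity(q_desc, d_desc)
    return dict(acc)

def ambiguity(acc):
    """Ratio of the two strongest hypotheses; values near 1.0 mean the
    retrieval is ambiguous and another view should be captured."""
    top = sorted(acc.values(), reverse=True)
    return top[1] / top[0] if len(top) > 1 and top[0] > 0 else 1.0

scores = vote([((0.85, 0.15), 5.0)])
best = max(scores, key=scores.get)
```

In an active-vision loop, the agent would keep acquiring views (choosing the direction expected to reduce the ambiguity most) until `ambiguity(scores)` falls below a threshold.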

Contents
Viewer-Centered Visual Representation with Orientation Data
Active Retrieval to Minimize Ambiguity
Experiments and Analysis of Results
Datasets
The Role of Orientation
Active Vision for Retrieval
Running Time and Memory Requirements
Findings
Discussion and Conclusions