Abstract

Robotic grasping plays an important role in robotics. State-of-the-art robotic grasping detection systems are usually built on conventional frame-based vision, such as RGB-D cameras. Compared with traditional frame-based computer vision, neuromorphic vision is a small and young research community, and event-based datasets remain scarce because annotating the asynchronous event stream is laborious. Annotating large-scale vision datasets already consumes considerable resources, and video-level annotation is even more demanding. In this work, we consider the problem of detecting robotic grasps in a moving camera view of a scene containing objects. To obtain more agile robotic perception, a neuromorphic vision sensor (Dynamic and Active-pixel Vision Sensor, DAVIS) attached to the robot gripper is introduced to explore its potential for grasping detection. We construct a robotic grasping dataset, named the Event-Grasping dataset, with 91 objects. A spatial-temporal mixed particle filter (SMP filter) is proposed to track LED-based grasp rectangles, which enables video-level annotation of a single grasp rectangle per object. Because the LEDs blink at high frequency, the Event-Grasping dataset is annotated at 1 kHz. Based on this dataset, we develop a deep neural network for grasping detection that treats angle learning as classification rather than regression. The method achieves high detection accuracy on the Event-Grasping dataset, with 93% precision under an object-wise split. This work provides a large-scale, well-annotated dataset and promotes neuromorphic vision applications in agile robotics.
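The abstract notes that grasp orientation is learned as classification rather than regression. A minimal sketch of the usual discretization idea, assuming the angle is folded into [0, π) and split into a fixed number of bins; the bin count and helper names below are illustrative assumptions, not the paper's implementation.

```python
import math

# Assumption: a grasp rectangle at theta is equivalent to one at theta + pi,
# so angles are folded into [0, pi) and discretized into orientation classes.
NUM_ANGLE_BINS = 18  # illustrative choice: 10-degree resolution

def angle_to_class(theta: float) -> int:
    """Map a continuous grasp angle (radians) to a discrete orientation class."""
    theta = theta % math.pi                          # fold into [0, pi)
    return int(theta / math.pi * NUM_ANGLE_BINS) % NUM_ANGLE_BINS

def class_to_angle(label: int) -> float:
    """Recover the bin-center angle (radians) for a predicted class."""
    return (label + 0.5) * math.pi / NUM_ANGLE_BINS

# Example: a 95-degree grasp falls into bin 9 of 18, whose center is 95 degrees.
print(angle_to_class(math.radians(95)))   # -> 9
print(math.degrees(class_to_angle(9)))    # -> 95.0
```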

Highlights

  • Neuromorphic vision, based on neuromorphic sensors, represents visual information as an address-event representation (AER)

  • Reflection noise and cross-impact noise degrade the tracking results because of the uncertainty of object surfaces and the presence of multiple blink frequencies. These noises can be filtered out by a spatial-temporal mixed particle (SMP) filter, with results shown in Figure 5 (a sketch of this evidence-weighting idea follows the list)

  • We evaluated our grasping detection algorithm on the Event-Grasping dataset, which is recorded with a neuromorphic vision sensor (DAVIS) under two lighting conditions, bright and dark
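A minimal sketch of the evidence-weighting idea behind the SMP filter mentioned above, assuming a simple event format and a 1 kHz LED blink period; all function names, window sizes, and tolerances are illustrative assumptions rather than the paper's actual filter.

```python
import numpy as np

# Illustrative sketch: weight a particle hypothesis for an LED marker position by
# combining temporal evidence (inter-event intervals matching the blink period)
# and spatial evidence (event density near the hypothesis). Reflections and LEDs
# blinking at other frequencies fail the temporal test and receive low weight.
LED_PERIOD_US = 1000.0   # assumed blink period of a 1 kHz LED, in microseconds
WINDOW_PX = 3            # spatial half-window around a particle hypothesis

def temporal_evidence(timestamps_us, period_us=LED_PERIOD_US, tol_us=100.0):
    """Fraction of inter-event intervals that match the expected blink period."""
    if len(timestamps_us) < 2:
        return 0.0
    intervals = np.diff(np.sort(timestamps_us))
    return float((np.abs(intervals - period_us) < tol_us).mean())

def spatial_evidence(events_xy, particle_xy, half_window=WINDOW_PX):
    """Fraction of events falling inside a small window around the particle."""
    if len(events_xy) == 0:
        return 0.0
    dx = np.abs(events_xy[:, 0] - particle_xy[0])
    dy = np.abs(events_xy[:, 1] - particle_xy[1])
    return float(((dx <= half_window) & (dy <= half_window)).mean())

def particle_weight(events_xy, events_t_us, particle_xy):
    """Mix both evidences to score one particle hypothesis."""
    dx = np.abs(events_xy[:, 0] - particle_xy[0])
    dy = np.abs(events_xy[:, 1] - particle_xy[1])
    inside = (dx <= WINDOW_PX) & (dy <= WINDOW_PX)
    return temporal_evidence(events_t_us[inside]) * spatial_evidence(events_xy, particle_xy)
```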

Summary

INTRODUCTION

Neuromorphic vision, based on neuromorphic sensors, represents visual information as an address-event representation (AER). In an embedded robot system it is difficult to balance high-complexity perception algorithms against limited computation, storage, and power budgets, and these problems worsen when grasping a moving object. The neuromorphic vision sensor is seldom applied in robotics because annotating neuromorphic vision datasets, whose data take the form of an asynchronous event stream, is difficult. Apart from the low latency, data storage and computational demands are drastically reduced thanks to the sparse event stream. Another key property is the very high dynamic range of 130 dB, compared with about 60 dB for frame-based vision sensors. We have created a robotic grasping dataset named the “Event-Grasping Dataset” by directly recording the real world with a neuromorphic vision sensor (DAVIS) and labeling the asynchronous event stream.
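To make the AER format concrete, here is a minimal sketch of an event record and of accumulating events over a short time window; the field names and the helper function are assumptions for illustration, not the DAVIS driver's actual API.

```python
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class Event:
    """A single AER event: pixel address, timestamp, and brightness-change polarity."""
    x: int          # pixel column
    y: int          # pixel row
    t_us: int       # microsecond timestamp (events arrive asynchronously)
    polarity: int   # +1 brightness increase, -1 decrease

def accumulate_frame(events: List[Event], height: int, width: int,
                     t_start_us: int, t_end_us: int) -> np.ndarray:
    """Sum event polarities over a short time window into a 2D map.

    Only moving edges (or blinking LEDs) generate events, so the resulting map
    stays sparse even for windows far shorter than a conventional frame period."""
    frame = np.zeros((height, width), dtype=np.int32)
    for e in events:
        if t_start_us <= e.t_us < t_end_us:
            frame[e.y, e.x] += e.polarity
    return frame
```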

System Setting
Synchronization Problem
LED MARKER TRACKING
Event Data Formulation
Spatiotemporal Mixed Particle Filter
Temporal Evidence
Spatial Evidence
Tracking Results
EVENT-GRASPING DATASET
Base Dataset
Annotation Dataset
GRASPING DETECTION METHOD
Data Pre-processing
Proposals Detection With Multi-Scale Feature Map
Classification for Grasping Orientation
Loss Function
EXPERIMENTS AND RESULTS
Training
Metrics
Results
Discussions
CONCLUSION
DATA AVAILABILITY STATEMENT