Abstract

Event cameras are bio-inspired sensors that produce sparse and asynchronous event streams instead of frame-based images at a high-rate. Recent works utilizing graph convolutional networks (GCNs) have achieved remarkable performance in recognition tasks, which model event stream as spatio-temporal graph. However, the computational mechanism of graph convolution introduces redundant computation when aggregating neighbor features, which limits the low-latency nature of the events. And they perform a synchronous inference process, which can not achieve a fast response to the asynchronous event signals. This paper proposes a local-shift graph convolutional network (LSNet), which utilizes a novel local-shift operation equipped with a local spatio-temporal attention component to achieve efficient and adaptive aggregation of neighbor features. To improve the efficiency of pooling operation in feature extraction, we design a node-importance based parallel pooling method (NIPooling) for sparse and low-latency event data. Based on the calculated importance of each node, NIPooling can efficiently obtain uniform sampling results in parallel, which retains the diversity of event streams. Furthermore, for achieving a fast response to asynchronous event signals, an asynchronous event processing procedure is proposed to restrict the network nodes which need to recompute activations only to those affected by the new arrival event. Experimental results show that the computational cost can be reduced by nearly 9 times through using local-shift operation and the proposed asynchronous procedure can further improve the inference efficiency, while achieving state-of-the-art performance on gesture recognition and object recognition.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call