Abstract

The event camera has recently emerged as a new type of vision sensor, offering benefits such as low power consumption, high dynamic range (HDR), microsecond temporal resolution, and freedom from motion blur. While event cameras offer numerous advantages over conventional cameras, they capture only changes in intensity and discard much of the scene's appearance detail. This paper proposes an end-to-end UNet-based network, SCSE-E2VID, that synthesizes grayscale images from asynchronous events. We design an event fusion block that feeds more related events to the encoder, allowing the network to extract more informative features. The Spatial and Channel 'Squeeze & Excitation' (SCSE) attention block is employed to suppress artifacts and better extract spatiotemporal features for the decoder. In addition, we add parallel convolutions in the upsampling block to refine the output features, supplementing content lost in the reduced channels. To evaluate the performance of the proposed SCSE-E2VID, we conduct quantitative and qualitative comparisons on the public IJRR and HQF datasets. The results show that our method outperforms state-of-the-art methods in perceptual similarity and structural similarity, and achieves comparable performance in terms of squared error.
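For readers unfamiliar with the SCSE attention mentioned above, the following is a minimal PyTorch sketch of a concurrent spatial and channel 'Squeeze & Excitation' block in its standard form (Roy et al.). The class name, reduction ratio, and the additive fusion of the two branches are illustrative assumptions; they are not taken from the SCSE-E2VID architecture itself.

import torch
import torch.nn as nn

class SCSEBlock(nn.Module):
    """Concurrent spatial and channel 'Squeeze & Excitation' attention.

    Standard formulation (Roy et al.); the reduction ratio and the additive
    combination of the two branches are assumptions, not details from the paper.
    """
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel squeeze & excitation (cSE): global pooling + bottleneck MLP
        self.cse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial squeeze & excitation (sSE): 1x1 conv to a single attention map
        self.sse = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Recalibrate features along the channel axis and the spatial axis, then sum
        return x * self.cse(x) + x * self.sse(x)

# Example: recalibrate a decoder feature map of shape (N, 64, H, W)
feats = torch.randn(2, 64, 32, 32)
out = SCSEBlock(64)(feats)  # output has the same shape as the input

In a UNet-style decoder such as the one described above, a block like this would typically be applied to each upsampled feature map before the final reconstruction layer, letting the network emphasize informative channels and spatial locations while suppressing artifact-prone ones.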
