Abstract

Occlusion and motion blur make video frame interpolation challenging, since estimating complex motions between two frames is difficult and unreliable, especially in highly dynamic scenes. This paper addresses these issues by exploiting a spike stream as auxiliary visual information between frames to synthesize target frames. Instead of estimating motions via optical flow from RGB frames, we present a new dual-modal pipeline (SVFI) that takes both RGB frames and the corresponding spike stream as inputs. It extracts scene-structure and object-outline feature maps of the target frames from the spike stream. These feature maps are fused with the color and texture feature maps extracted from the RGB frames to synthesize the target frames. Benefiting from the spike stream, which contains continuous information between two frames, SVFI can directly extract information in the occluded and motion-blurred regions of target frames from the spike stream, making it more robust than previous optical flow-based methods. Experiments show that SVFI outperforms SOTA methods on a wide variety of datasets. For instance, in 7- and 15-frame-skip evaluations, it achieves up to 5.58 dB and 6.56 dB improvements in PSNR over the respective second-best methods, BMBC and DAIN. SVFI also shows visually impressive performance in real-world scenes.
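To make the dual-modal idea concrete, here is a minimal sketch of a two-branch fusion network: one encoder for the spike stream (scene structure and outlines) and one for the two boundary RGB frames (color and texture), whose feature maps are concatenated and decoded into the target frame. This is not the paper's actual SVFI architecture; all layer sizes, channel counts, and the `DualModalFusion` name are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DualModalFusion(nn.Module):
    """Hypothetical sketch of dual-modal fusion (not the paper's network)."""
    def __init__(self, spike_channels=32, rgb_channels=6, feat=64):
        super().__init__()
        # Spike branch: the temporal spike window is stacked along channels.
        self.spike_enc = nn.Sequential(
            nn.Conv2d(spike_channels, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
        )
        # RGB branch: the two boundary frames are stacked along channels (2 x 3).
        self.rgb_enc = nn.Sequential(
            nn.Conv2d(rgb_channels, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Fusion + decoder: concatenate both feature maps, predict the frame.
        self.decoder = nn.Sequential(
            nn.Conv2d(2 * feat, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, 3, 3, padding=1),
        )

    def forward(self, spikes, frames):
        fused = torch.cat([self.spike_enc(spikes), self.rgb_enc(frames)], dim=1)
        return self.decoder(fused)

# Usage: a 32-step spike window plus the two neighbouring RGB frames.
model = DualModalFusion()
spikes = torch.rand(1, 32, 128, 128)  # spike stream, stacked over time
frames = torch.rand(1, 6, 128, 128)   # frame 0 and frame 1, concatenated
target = model(spikes, frames)        # synthesized frame, shape (1, 3, 128, 128)
```

The design choice the abstract emphasizes is that the spike branch supplies per-target-frame structure directly, so no optical flow between the two RGB frames is estimated anywhere in the pipeline.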
