Abstract

We present a bidirectional transformer network that exploits long-range dependencies in both the temporal and spatial domains for video deblurring. Motivated by the observation that, in the degradation process, optical flow relates to the latent frames rather than the blurry ones, we first develop a pre-deblurring module that generates initial latent frames; these initial latent frames are then used to estimate optical flow so that temporal information can be explored more reliably. We then propose an effective bidirectional transformer to exploit long-range information: it first aggregates temporal information over the whole sequence through backward and forward propagation guided by the estimated optical flow, and a recurrent pixel-wise neighbor transformer (PWNT) block at the end of the module then extracts useful spatial information. We embed our bidirectional transformer into a deep convolutional neural network and evaluate it on publicly available video deblurring benchmarks. Extensive experimental results show that the proposed method performs favorably against state-of-the-art methods. The implementation is available at https://github.com/Rebeccaxq/pwnt

