In this paper, we propose a deep-learning-based, sensor-driven method for online video stabilization. The method uses Euler angles and acceleration estimated from the gyroscope and accelerometer to assist stable video reconstruction. We introduce two simple sub-networks for trajectory optimization: the first network exploits the real (unstable) camera trajectory and acceleration to detect the shooting scenario, and it generates an attention mask that adaptively selects scenario-specific features. The second network, an LSTM, then predicts smooth camera paths from the real unstable trajectory under the guidance of this mask. The output of the trajectory optimization network is filtered with a two-step modification process to guarantee smoothness. The real and smoothed camera paths are then used as guidance to projectively generate stable frames. We also capture videos with synchronized sensor data covering seven typical shooting scenarios and design a ground-truth generation method to construct pseudo-labels. Moreover, the trajectory smoothing network allows a buffer of 3 or 10 future frames to form a lookahead filter. Experimental results show that our online method outperforms offline state-of-the-art methods on several shaky video clips while using fewer buffer frames, on both general and low-quality videos. Furthermore, by avoiding image content analysis, our method substantially reduces running time, reaching a stabilization speed of 25 fps on 1080p videos.
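To make the two-stage trajectory optimization concrete, the sketch below shows one possible arrangement of a scenario-detection network that emits an attention mask and an LSTM-based smoothing network modulated by that mask. This is a minimal illustration assuming PyTorch; the layer sizes, module names, and fusion scheme are our own assumptions and not the architecture published in the paper.

```python
import torch
import torch.nn as nn

class ScenarioAttention(nn.Module):
    """Hypothetical scenario-detection branch: encodes the unstable trajectory
    (Euler angles) and acceleration, and emits an attention mask over features."""
    def __init__(self, in_dim=6, hidden=64, feat_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim),
        )

    def forward(self, traj_and_accel):        # (B, T, 6): 3 Euler angles + 3 accelerations
        feats = self.encoder(traj_and_accel)  # (B, T, feat_dim)
        return torch.sigmoid(feats)           # attention mask in [0, 1]

class SmoothingLSTM(nn.Module):
    """Hypothetical smoothing branch: predicts a smooth camera path from the
    real (unstable) trajectory, modulated by the scenario attention mask."""
    def __init__(self, in_dim=3, feat_dim=32, hidden=64):
        super().__init__()
        self.proj = nn.Linear(in_dim, feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, in_dim)  # smoothed Euler angles per frame

    def forward(self, unstable_traj, mask):
        x = self.proj(unstable_traj) * mask    # scenario-adaptive feature selection
        out, _ = self.lstm(x)
        return self.head(out)

# Dummy usage: a 30-frame clip plus a 3-frame buffer acting as the lookahead window.
traj = torch.randn(1, 30 + 3, 3)              # unstable Euler angles per frame
accel = torch.randn(1, 30 + 3, 3)             # accelerometer readings per frame
mask = ScenarioAttention()(torch.cat([traj, accel], dim=-1))
smooth_traj = SmoothingLSTM()(traj, mask)     # (1, 33, 3) smoothed camera path
```

In this sketch the mask gates the projected trajectory features element-wise before the LSTM, which is one straightforward way to realize "choosing scenario-specific features adaptively"; the paper's actual fusion mechanism may differ.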