Abstract

An edge map is a feature map representing the contours of the objects in an image. A previous Single Image Super Resolution (SISR) method that used edge maps achieved a notable improvement in SSIM. Unlike SISR, Video Super Resolution (VSR) operates on video, which consists of consecutive frames with temporal features; accordingly, some VSR models adopted motion estimation and motion compensation to exploit spatio-temporal feature maps. In contrast to those models, we take a different approach by adding edge structure information and a related post-processing step to an existing model. Our model, “Video Super Resolution Using a Selective Edge Aggregation Network (SEAN),” consists of two stages. First, the model selectively generates an edge map from the target frame and its neighboring frames. At this stage, we adopt a magnitude loss function so that the output of SEAN learns the contours of each object more clearly. Second, the final output is generated by the refinement (post-processing) module. SEAN produces more distinct object contours and better color correction than other existing models.
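The abstract does not spell out the exact form of the magnitude loss; as a rough sketch, assuming it penalizes the difference between the gradient magnitudes of the super-resolved output and the high-resolution target (computed here with Sobel filters), it might look like the following PyTorch snippet. The function names and the Sobel-based formulation are illustrative assumptions, not the authors' exact implementation.

    import torch
    import torch.nn.functional as F

    def gradient_magnitude(img):
        # Per-pixel gradient magnitude via depthwise Sobel filtering (illustrative).
        sobel_x = torch.tensor([[-1., 0., 1.],
                                [-2., 0., 2.],
                                [-1., 0., 1.]]).view(1, 1, 3, 3)
        sobel_y = sobel_x.transpose(2, 3)
        c = img.shape[1]
        gx = F.conv2d(img, sobel_x.repeat(c, 1, 1, 1).to(img), padding=1, groups=c)
        gy = F.conv2d(img, sobel_y.repeat(c, 1, 1, 1).to(img), padding=1, groups=c)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

    def magnitude_loss(sr, hr):
        # L1 distance between the edge (gradient) magnitudes of the SR output and
        # the HR ground truth, pushing the network to reproduce object contours sharply.
        return F.l1_loss(gradient_magnitude(sr), gradient_magnitude(hr))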

Highlights

  • The Video Super Resolution (VSR) task is usually processed using information from consecutive frames

  • In the case of a neighboring frame with little change in motion, there is no significant difference in the edge maps obtained in each frame

  • The proposed refinement is applied to the VSR field for the first time, and because it is designed for high versatility, it is compatible with other VSR models according to the user’s choice

Summary

Introduction

The VSR task is usually processed using information from consecutive frames. To exploit this information, some VSR models [1,2] adopted motion estimation and motion compensation to apply spatio-temporal features. Before generating an edge map, it is necessary to detect the motion difference between neighboring frames: when a neighboring frame shows little change in motion, there is no significant difference between the edge maps obtained from the two frames. Our model uses Canny edge detection to generate an edge map for each frame, except for blurred frames. Because artifacts exist in some frames of test sets such as Vid4 [3], incomplete areas are generated. To solve this problem, there has been an attempt to use the correlation between frames [4]. This process resembles the refining step known as post-processing in classic vision, but it has a different character: it uses deep learning and, in particular, can minimize information loss, a classic problem of the convolution operation. The proposed refinement is applied to the VSR field for the first time, and because it is designed for high versatility, it is compatible with other VSR models according to the user’s choice.
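As a concrete illustration of this edge-extraction step, the following OpenCV sketch generates a Canny edge map per frame, skips recomputing it for blurred frames, and reuses the previous map when the motion difference to the preceding frame is small. The blur test (variance of the Laplacian), the thresholds, and the mean-absolute-difference motion measure are assumptions made for illustration; the paper's actual selection criteria may differ.

    import cv2
    import numpy as np

    BLUR_THRESHOLD = 100.0    # hypothetical: Laplacian variance below this -> treat frame as blurred
    MOTION_THRESHOLD = 2.0    # hypothetical: mean absolute frame difference below this -> reuse map

    def is_blurred(gray):
        # Simple sharpness check: low variance of the Laplacian response indicates blur.
        return cv2.Laplacian(gray, cv2.CV_64F).var() < BLUR_THRESHOLD

    def edge_maps(frames):
        # Return one edge map per frame, reusing the previous map for blurred
        # or nearly static frames instead of recomputing it.
        maps, prev_gray, prev_edges = [], None, None
        for frame in frames:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            reuse = prev_edges is not None and (
                is_blurred(gray)
                or np.abs(gray.astype(np.float32) - prev_gray.astype(np.float32)).mean() < MOTION_THRESHOLD
            )
            if reuse:
                edges = prev_edges                 # blur or little motion: edge map barely changes
            else:
                edges = cv2.Canny(gray, 50, 150)   # hypothetical Canny thresholds
            maps.append(edges)
            prev_gray, prev_edges = gray, edges
        return maps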

Single Image Super Resolution
Video Super Resolution
Post-Processing
Method
Edge Map Extraction
Feature Extraction
Back Projection
Reconstruction
Refinement
Loss Function
Experiment
Ablation Study
Comparison with State-of-the-Art Methods
Findings
Conclusions