Abstract

We are motivated by the observation that, for problems where inputs and outputs have the same form, such as image enhancement, a deep neural network can be reinforced by retraining it with a new target set to the network's output for the original target. As an example, we introduce a new learning strategy for super-resolution (SR) that recurrently trains the same simple network. Unlike existing self-trained SR, which involves a single stage of learning followed by multiple runs at test time, our method trains the same SR network multiple times with increasingly better targets and requires only a single inference at test time. At each stage of the proposed learning scheme, a new training target is obtained by applying the most recently trained SR network to the original target image and downscaling the resulting SR image to normalize its size. Even though downscaling is involved, we argue that the downscaled SR image is a better target than the previous one. We show mathematically that, under a linear approximation, this process is similar to unsharp masking and that it makes the image sharper. However, unlike unsharp masking, the proposed recurrent learning tends to converge to a specific target. By retraining the existing network toward an increasingly enhanced target, the proposed method achieves an effect similar to applying SR multiple times without increasing implementation cost or inference time. To objectively verify the superiority of our approach through experiments, we propose to use VIQET MOS, which does not require a reference image, as a measure of image quality. To the best of our knowledge, this is the first work in image enhancement whose objective quality measure has been validated by showing results consistent with actual users' subjective evaluations. The proposed recurrent learning scheme makes existing SR algorithms more useful by clearly improving the effect of SR. Code is available at https://github.com/rtsr82/rtsr.git.
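As a rough illustration of the training scheme described above, the following sketch assumes a generic PyTorch-style SR model; the names (`sr_net`, `downscale`, `rtsr`) and the bicubic downscaling choice are assumptions for illustration only, not the authors' released implementation (see the repository linked above for that).

```python
# Illustrative sketch of recurrently-trained super-resolution (RTSR).
# Assumptions (not from the paper's code): a PyTorch SR model `sr_net`,
# 4-D (N, C, H, W) image tensors, bicubic downscaling, and L1 loss.
import torch
import torch.nn.functional as F

def downscale(img, scale=2):
    # Normalize the SR output back to the size of the original target.
    return F.interpolate(img, scale_factor=1.0 / scale, mode="bicubic",
                         align_corners=False)

def train_one_stage(sr_net, optimizer, pairs, epochs=10):
    # `pairs` yields (low-res input, target) batches for the current stage.
    for _ in range(epochs):
        for lr_img, target in pairs:
            loss = F.l1_loss(sr_net(lr_img), target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

def rtsr(sr_net, optimizer, lr_imgs, hr_targets, num_stages=3, scale=2):
    # Recurrently retrain the SAME network, each time against targets
    # produced from the previous stage's network.
    targets = hr_targets
    for _ in range(num_stages):
        train_one_stage(sr_net, optimizer, list(zip(lr_imgs, targets)))
        with torch.no_grad():
            # New target = downscaled output of the current network applied
            # to the ORIGINAL target, so its size matches the old target.
            targets = [downscale(sr_net(t), scale) for t in hr_targets]
    return sr_net  # one network; test time still needs a single inference
```

Note that only the targets change between stages; the architecture and the test-time pipeline stay the same, which is how the scheme avoids the extra cost of running SR multiple times at inference.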

Highlights

  • Nowadays, FHD (1920 × 1080) videos are widely used in digital broadband systems, while UHD (3840 × 2160) televisions are prevalent

  • The contributions of this paper are as follows: 1) We experimentally show that if inputs and targets of a network are in the same form, a deep neural network can be reinforced by recurrently training it with a new target set to the output obtained by inputting the original target to the network

  • 2) As an example, we propose an idea to reinforce a deep neural network (DNN) for super-resolution with its own output by adding a step that adjusts the size of the output to match that of the initial target, and we show the effectiveness of the proposed Recurrently-Trained Super-Resolution scheme through experiments
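The abstract's connection to unsharp masking can be written schematically. The notation below (a linearized SR operator $S$, a downscaling operator $D$, a low-pass filter $G$) is illustrative and not taken verbatim from the paper; it is meant only to show why the downscaled SR output acts as a sharpened target.

```latex
% Illustrative linear view of one target-update step (notation assumed here).
% t^{(k)}: current target, S: linearized SR operator, D: downscaling operator.
\begin{align}
  t^{(k+1)} &= D\,S\,t^{(k)}, \\
  t^{(k+1)} &\approx t^{(k)} + \lambda\bigl(t^{(k)} - G\,t^{(k)}\bigr)
  \quad \text{if } D S \approx I + \lambda (I - G),\ \lambda > 0,
\end{align}
```

which has the form of an unsharp-masking update (the original plus a scaled high-frequency residual), consistent with the abstract's claim that each stage sharpens the target; unlike plain unsharp masking, however, the recurrent learning is stated to converge rather than sharpen indefinitely.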



Introduction

FHD (1920 × 1080) videos are widely used in digital broadband systems, while UHD (3840 × 2160) televisions are prevalent. Even though original content may have been made and broadcast in UHD, and very high-quality TVs with 8K resolution are available, it is hard to say that all the intermediate broadband systems support UHD as of 2020. In this general situation, it is often necessary to upscale an FHD video input to the UHD TV output, and only a 2× upscale is needed. In this case, the performance of a simple Lanczos scaler is not bad, and
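For the baseline path mentioned above, a 2× Lanczos upscale is a one-liner in common imaging libraries; a minimal sketch with OpenCV follows (the file names are placeholders).

```python
# Minimal 2x FHD (1920x1080) -> UHD (3840x2160) upscaling with Lanczos
# interpolation via OpenCV. File names are placeholders for illustration.
import cv2

fhd = cv2.imread("frame_fhd.png")                   # 1920 x 1080 input frame
uhd = cv2.resize(fhd, None, fx=2, fy=2,
                 interpolation=cv2.INTER_LANCZOS4)  # 3840 x 2160 output
cv2.imwrite("frame_uhd.png", uhd)
```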
