Abstract
Deep image prior (DIP), which uses a deep convolutional network (ConvNet) structure as an image prior, has attracted wide attention in computer vision and machine learning. DIP empirically demonstrates the effectiveness of ConvNet structures for various image restoration applications. However, why DIP works so well is still unknown, and it also remains unclear why the convolution operation is useful for image reconstruction or enhancement. This study tackles this ambiguity of ConvNet/DIP by proposing an interpretable approach that divides the convolution into "delay embedding" and "transformation" (i.e., encoder-decoder). Our approach is a simple but essential image/tensor modeling method that is closely related to self-similarity. The proposed method is called manifold modeling in embedded space (MMES) since it is implemented using a denoising autoencoder in combination with a multiway delay-embedding transform. Despite its simplicity, our experiments show that MMES is competitive with DIP, producing quite similar results on image/tensor completion, super-resolution, deconvolution, and denoising. These results also facilitate interpretation/characterization of DIP from the perspective of a "low-dimensional patch-manifold prior."
Highlights
The most important piece of information for image/tensor restoration would be the “prior.” The prior usually converts the optimization problems from ill-posed to well-posed and/or enhances the robustness to specific noises and outliers.
The low dimensionality of the patch manifold restricts the nonlocal similarity of the images/tensors, and it would be related to the “impedance” in deep image prior (DIP).
The results are presented in comparison to DIP and other selected methods for color-image inpainting, super-resolution, deconvolution, and denoising tasks.
Summary
The most important piece of information for image/tensor restoration would be the “prior.” The prior usually converts the optimization problems from ill-posed to well-posed and/or enhances the robustness to specific noises and outliers. Ulyanov et al. [61], [62] have reported a very interesting phenomenon of the fully convolutional generator network [convolutional network (ConvNet)]: they claimed that the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning. The contributions of this study can be summarized as follows: 1) a new and simple approach for image/tensor modeling is proposed, which extends and simplifies the ConvNet as a combination of delay embedding and a multilayer perceptron; 2) the effectiveness of the proposed method and its similarity to DIP are demonstrated by extensive experiments; and 3) most importantly, there is a prospect for interpreting/characterizing DIP as a “low-dimensional patch-manifold prior.”
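The delay-embedding half of this decomposition can be illustrated in isolation. Below is a minimal NumPy sketch (not the paper's implementation, which uses a multiway delay-embedding transform on tensors) of 1-D delay embedding: overlapping windows of a signal are stacked into a Hankel matrix, and for a self-similar signal such as a sinusoid that matrix is low-rank, i.e., its patches lie on a low-dimensional manifold.

```python
import numpy as np

def delay_embed(signal, tau):
    """Delay embedding of a 1-D signal: stack all overlapping
    windows of length tau as rows of a Hankel-structured matrix."""
    n = len(signal)
    return np.stack([signal[i:i + tau] for i in range(n - tau + 1)])

# A single sinusoid is perfectly self-similar: every window is a
# linear combination of sin and cos, so the Hankel matrix has rank 2.
t = np.linspace(0, 2 * np.pi, 100)
H = delay_embed(np.sin(t), 16)          # shape (85, 16)
rank = np.linalg.matrix_rank(H, tol=1e-6)
```

In MMES, an autoencoder with a low-dimensional bottleneck plays the role of this low-rank constraint on the embedded patches, while the inverse embedding (patch aggregation) maps the manifold back to the image.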