Intrinsic Image Decomposition Based on Quantized Prior Codebook

Abstract

Intrinsic image decomposition is a low-level image processing task that extracts the reflectance and lighting components from an image. This process can improve the illumination robustness of perception tasks such as object detection, recognition, and image understanding. Recently, deep image generation frameworks have been used to generate intrinsic images, but their encoders and decoders lack prior-knowledge constraints. This paper presents a quantized codebook that embeds intrinsic features and guides the extraction of intrinsic images. To enhance reconstruction accuracy, we propose a purification method that eliminates irrelevant elements from the codebook. Additionally, we propose self-attention and cross-attention modules that integrate the intrinsic features of the codebook into the input image features for reconstruction. The effectiveness of the algorithm is demonstrated through experiments on several popular datasets.
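
The quantized codebook works like vector quantization: each encoder feature is snapped to its nearest learned prototype, injecting the intrinsic prior into reconstruction. A minimal NumPy sketch of the lookup (shapes, names, and the nearest-neighbour rule are illustrative assumptions, not the paper's exact design):

```python
import numpy as np

def quantize(features, codebook):
    """Replace each feature vector with its nearest codebook entry.

    features: (N, D) array of encoder features.
    codebook: (K, D) array of learned intrinsic prototypes.
    Returns the quantized features and the chosen indices.
    """
    # Squared Euclidean distance between every feature and every code.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)          # nearest prototype per feature
    return codebook[idx], idx

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # K=8 prototypes, D=4 dims
features = codebook[[2, 5]] + 0.01   # two features near codes 2 and 5
quantized, idx = quantize(features, codebook)
```

The purification step the abstract mentions would then amount to dropping codebook rows that are rarely selected, so only codes carrying intrinsic structure remain.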

Similar Papers
  • Book Chapter
  • Cited by 1
  • 10.1007/978-3-030-34032-2_14
Intrinsic Face Image Decomposition from RGB Images with Depth Cues
  • Jan 1, 2019
  • Shirui Liu + 2 more

As a preprocessing step for reconstructing face attributes, the quality of intrinsic face image decomposition directly impacts the subsequent operations that reconstruct facial attribute detail. Intrinsic face image decomposition methods face two challenging problems: the quality of the base intrinsic face image and the detail of the shading image. In this study, a new model for intrinsic face image decomposition from RGB images with depth cues is proposed that produces high-quality results even with simple constraints. The proposed model consists of three main steps: face cropping, RGB color normalization, and super-pixel segmentation. The face image is first cropped to obtain the face area, a color normalization process then normalizes the RGB pixels of the cropped face image, and finally super-pixel segmentation based on the mean-shift algorithm is applied, which performs well at reducing artifacts and retaining detail in the shading image. To evaluate the proposed model, both qualitative and quantitative assessments are used: the qualitative assessment compares the intrinsic image results against human subjective visual standards, and the quantitative assessment analyzes image information entropy. Both demonstrate that the proposed model outperforms other techniques in the field of intrinsic face image decomposition.
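
The color-normalization step can be sketched with a common choice, per-pixel chromaticity, which cancels illumination intensity and keeps color ratios (the paper does not specify its exact formula, so this particular normalization is an illustrative assumption):

```python
import numpy as np

def normalize_rgb(img, eps=1e-8):
    """Per-pixel chromaticity: divide each channel by the RGB sum,
    so shading intensity cancels and only color ratios remain."""
    s = img.sum(axis=-1, keepdims=True)
    return img / (s + eps)

# A pixel and the same pixel under twice the illumination
# normalize to (almost) the same chromaticity.
px = np.array([[0.2, 0.4, 0.2]])
bright = 2.0 * px
```

Because the normalized channels sum to one, brightness differences across the face are suppressed before segmentation.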

  • Research Article
  • Cited by 10
  • 10.1109/tgrs.2021.3102644
Intrinsic Hyperspectral Image Decomposition With DSM Cues
  • Jan 1, 2022
  • IEEE Transactions on Geoscience and Remote Sensing
  • Xudong Jin + 2 more

Intrinsic hyperspectral image decomposition (IHID) aims to recover physical scene properties such as reflectance and illumination from a given hyperspectral image (HSI), which directly respects the physical imaging process and can benefit many HSI processing tasks. It is a severely ill-posed problem and is challenging to solve using HSI alone. Additional geometric information provided by digital surface models (DSMs) can otherwise help immensely. While intrinsic image decomposition for RGB images and RGB-D images has been studied extensively during the past few decades and has seen significant progress, studies of the problem for other types of data, such as HSIs and DSMs, are still needed. It is much more challenging to handle an HSI with hundreds of channels than an RGB image with only three channels. Moreover, compared with RGB-D data, HSIs and DSM data usually have much lower spatial resolutions and more complicated land covers, making it difficult to extend the RGB-D intrinsic image method directly. In this article, we present a novel IHID framework for HSIs with DSM cues. Utilizing spherical-harmonic illumination, we first propose a convenient HSI rendering model with DSM, which describes the interplay of material reflectance, geometric distribution, and environment illumination. Then, we introduce local and nonlocal priors on reflectance that ensure the local smooth and global consistency of recovered reflectance. Experiments on synthetic and real data demonstrate that the proposed method outperforms the state-of-the-art methods and is robust to illumination changes.

  • Book Chapter
  • Cited by 14
  • 10.1007/978-3-319-71607-7_55
Intrinsic Image Decomposition: A Comprehensive Review
  • Jan 1, 2017
  • Yupeng Ma + 4 more

Image understanding and analysis is one of the important tasks in image processing. Multiple factors influence the appearance of an object in an image; extracting intrinsic images from the observed image can effectively eliminate these environmental effects and make image understanding more accurate. Intrinsic images represent the inherent shape, color, and texture of an object. Intrinsic image decomposition recovers a shading image and a reflectance image from a single input image and remains challenging because it is severely ill-posed. To address this, researchers have proposed various decomposition algorithms. In this paper we survey recent advances in intrinsic image decomposition. First, we introduce the existing datasets for intrinsic image decomposition. Second, we introduce and analyze the existing algorithms. Finally, we run the existing algorithms on the intrinsic image datasets and analyze and summarize the experimental results.

  • Research Article
  • Cited by 14
  • 10.1109/tcsvt.2020.3024687
CasQNet: Intrinsic Image Decomposition Based on Cascaded Quotient Network
  • Sep 23, 2020
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Yupeng Ma + 4 more

Intrinsic image analysis plays an important role in image understanding, since it provides accurate reflectance, shape, and illumination information about the scene. However, it is an ill-posed problem that requires extra constraints to decompose a single image into a reflectance image and a shading image. Recently, deep neural networks have been introduced for intrinsic image analysis and can produce both intrinsic components simultaneously. The mutually exclusive relationship between the reflectance image and the shading image is not only a constraint on the decomposition but can also improve its results, yet current networks typically omit this relationship. To address this, we propose a novel deep network called the Cascaded Quotient Network (CasQNet) for intrinsic image decomposition. CasQNet consists of two sub-networks: a Pyramid Mini-U-Net (PyNet) that extracts the reflectance image at multiple scales, and a Shading Optimization Network (SoNet) that optimizes the resulting shading. The two sub-networks are cascaded by a quotient operation, which directly enforces the mutually exclusive relationship between reflectance and shading in the network architecture. In PyNet, the reflectance image is reconstructed by a series of nested multi-scale U-Nets, which simplifies the learning task for each U-Net. SoNet addresses the non-smooth and blurred extreme points caused by the quotient operation. PyNet and SoNet are trained alternately and finally joined in the cascaded structure. Furthermore, we combine multiple loss functions (data loss, correlation loss, and reconstruction loss) to improve learning effectiveness. To evaluate the proposed algorithm, extensive experiments are performed on three datasets: ShapeNet, BOLD Surface, and MIT Intrinsic Images. Qualitative and quantitative results show that our model outperforms state-of-the-art methods.
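
The quotient operation that cascades the two sub-networks can be illustrated directly: given the input image I and a predicted reflectance R, shading follows elementwise as S = I / R, so R * S reproduces I by construction. A sketch (variable names and the epsilon guard are illustrative):

```python
import numpy as np

def quotient_shading(image, reflectance, eps=1e-6):
    """Derive shading as the per-pixel quotient S = I / R, which
    enforces I = R * S (up to eps) by construction."""
    return image / (reflectance + eps)

img = np.array([[0.4, 0.6], [0.2, 0.9]])
refl = np.array([[0.8, 0.6], [0.5, 0.9]])
shading = quotient_shading(img, refl)
# Multiplying back, refl * shading recovers the input image.
```

This also shows why a shading-optimization stage is needed: where R approaches zero the quotient blows up, producing the non-smooth extreme points the abstract mentions.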

  • Book Chapter
  • Cited by 81
  • 10.1007/978-3-319-10602-1_15
Intrinsic Face Image Decomposition with Human Face Priors
  • Jan 1, 2014
  • Chen Li + 2 more

We present a method for decomposing a single face photograph into its intrinsic image components. Intrinsic image decomposition has commonly been used to facilitate image editing operations such as relighting and re-texturing. Although current single-image intrinsic image methods are able to obtain an approximate decomposition, image operations involving the human face require greater accuracy since slight errors can lead to visually disturbing results. To improve decomposition for faces, we propose to utilize human face priors as constraints for intrinsic image estimation. These priors include statistics on skin reflectance and facial geometry. We also make use of a physically-based model of skin translucency to heighten accuracy, as well as to further decompose the reflectance image into a diffuse and a specular component. With the use of priors and a skin reflectance model for human faces, our method is able to achieve appreciable improvements in intrinsic image decomposition over more generic techniques.

  • Research Article
  • Cited by 461
  • 10.1145/2601097.2601206
Intrinsic images in the wild
  • Jul 27, 2014
  • ACM Transactions on Graphics
  • Sean Bell + 2 more

Intrinsic image decomposition separates an image into a reflectance layer and a shading layer. Automatic intrinsic image decomposition remains a significant challenge, particularly for real-world scenes. Advances on this longstanding problem have been spurred by public datasets of ground truth data, such as the MIT Intrinsic Images dataset. However, the difficulty of acquiring ground truth data has meant that such datasets cover a small range of materials and objects. In contrast, real-world scenes contain a rich range of shapes and materials, lit by complex illumination. In this paper we introduce Intrinsic Images in the Wild, a large-scale, public dataset for evaluating intrinsic image decompositions of indoor scenes. We create this benchmark through millions of crowdsourced annotations of relative comparisons of material properties at pairs of points in each scene. Crowdsourcing enables a scalable approach to acquiring a large database, and uses the ability of humans to judge material comparisons, despite variations in illumination. Given our database, we develop a dense CRF-based intrinsic image algorithm for images in the wild that outperforms a range of state-of-the-art intrinsic image algorithms. Intrinsic image decomposition remains a challenging problem; we release our code and database publicly to support future research on this problem, available online at http://intrinsic.cs.cornell.edu/.
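
The pairwise judgements collected here underpin the paper's WHDR (Weighted Human Disagreement Rate) metric: a predicted reflectance is scored by how often its point-pair ratios disagree with human "darker / about equal" labels. A minimal sketch (judgement encoding and weights are illustrative; the 10% ratio threshold follows the metric's usual default):

```python
import numpy as np

def whdr(reflectance, judgements, delta=0.10):
    """Weighted Human Disagreement Rate over pairwise judgements.

    Each judgement is (i1, i2, darker, weight), where darker is
    '1' if point i1 is darker, '2' if point i2 is darker, and 'E'
    if they are about equal; i1/i2 index flattened reflectance.
    """
    r = reflectance.ravel()
    err = total = 0.0
    for i1, i2, darker, w in judgements:
        ratio = r[i1] / max(r[i2], 1e-10)
        if ratio < 1.0 / (1.0 + delta):
            pred = '1'            # point 1 is darker
        elif ratio > 1.0 + delta:
            pred = '2'            # point 2 is darker
        else:
            pred = 'E'            # about equal
        if pred != darker:
            err += w
        total += w
    return err / total if total > 0 else 0.0

r = np.array([0.2, 0.4, 0.41])
js = [(0, 1, '1', 1.0), (1, 2, 'E', 1.0), (2, 0, '2', 1.0)]
```

Because only relative judgements are needed, the metric is insensitive to the global scale ambiguity of reflectance.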

  • Research Article
  • 10.1609/aaai.v39i10.33151
When Shadow Removal Meets Intrinsic Image Decomposition: A Joint Learning Framework Using Unpaired Data
  • Apr 11, 2025
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Rongjia Zheng + 3 more

We present a framework that achieves shadow removal by learning intrinsic image decomposition (IID) from unpaired shadow and shadow-free images. Although it is well known that intrinsic images, i.e., illumination and reflectance, are highly beneficial to shadow removal, IID is rarely adopted by previous work due to its inherent ambiguity and the scarcity of training data. However, we find that by properly coupling shadow removal and IID into a joint learning framework, they can reinforce each other and enable promising results on both tasks, even with unpaired training data. Our framework comprises an IID network for separating the shadow input image into illumination and reflectance, and an illumination recovery network for predicting shadow-free illumination, with which we produce the shadow removal output by recombining with the estimated reflectance. We perform extensive experiments on various benchmark datasets to demonstrate the effectiveness of our method in shadow removal, and also showcase our advantage over previous IID methods in handling images with complex shadows.

  • Research Article
  • Cited by 5
  • 10.1111/cgf.12874
Intrinsic Image Decomposition Using Multi‐Scale Measurements and Sparsity
  • Jun 6, 2016
  • Computer Graphics Forum
  • Shouhong Ding + 4 more

Automatic decomposition of intrinsic images, especially for complex real‐world images, is a challenging under‐constrained problem. Thus, we propose a new algorithm that generates and combines multi‐scale properties of chromaticity differences and intensity contrast. The key observation is that the estimation of image reflectance, which is neither a pixel‐based nor a region‐based property, can be improved by using multi‐scale measurements of image content. The new algorithm iteratively coarsens a graph reflecting the reflectance similarity between neighbouring pixels. Then multi‐scale reflectance properties are aggregated so that the graph reflects the reflectance property at different scales. This is followed by an L0 sparse regularization on the whole reflectance image, which enforces the variation in reflectance images to be high‐frequency and sparse. We formulate this problem through energy minimization, which can be solved efficiently within a few iterations. The effectiveness of the new algorithm is tested with the Massachusetts Institute of Technology (MIT) dataset, the Intrinsic Images in the Wild (IIW) dataset, and various natural images.
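
The L0 term favors a reflectance whose variation is sparse and high-frequency. A 1-D illustration of the effect, hard-thresholding small gradients so shading-like ripple is suppressed while a true reflectance step survives (an illustration of the sparsity idea, not the paper's energy-minimization solver):

```python
import numpy as np

def sparsify_gradients(signal, thresh=0.05):
    """Zero out small gradients and re-integrate, leaving a
    piecewise-constant signal with few (sparse) jumps."""
    g = np.diff(signal)
    g[np.abs(g) < thresh] = 0.0      # suppress low-amplitude variation
    return signal[0] + np.concatenate(([0.0], np.cumsum(g)))

# A step edge plus small shading-like ripple: the ripple is removed,
# the step (a true reflectance change) survives.
x = np.array([0.30, 0.31, 0.30, 0.80, 0.81, 0.80])
y = sparsify_gradients(x)
```

The result has exactly one nonzero gradient, matching the sparse, piecewise-constant character that L0 regularization encourages in reflectance.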

  • Research Article
  • Cited by 3
  • 10.1109/tpami.2022.3224253
Intrinsic Image Transfer for Illumination Manipulation.
  • Jun 1, 2023
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Junqing Huang + 3 more

This paper presents a novel intrinsic image transfer (IIT) algorithm for image illumination manipulation, which creates a local image translation between two illumination surfaces. This model is built on an optimization-based framework composed of illumination, reflectance, and content photo-realistic losses, respectively. Each loss is first defined on the corresponding sub-layers factorized by an intrinsic image decomposition, and then reduced under the well-known spatially-varying illumination and illumination-invariant reflectance priors. We illustrate that all losses, with the aid of an "exemplar" image, can be directly defined on images without the need for an intrinsic image decomposition, thereby giving a closed-form solution to image illumination manipulation. We also demonstrate its versatility and benefits on several illumination-related tasks: illumination compensation, image enhancement and tone mapping, and high dynamic range (HDR) image compression, and show high-quality results on natural image datasets.

  • Book Chapter
  • Cited by 110
  • 10.1007/978-3-319-46484-8_9
Unified Depth Prediction and Intrinsic Image Decomposition from a Single Image via Joint Convolutional Neural Fields
  • Jan 1, 2016
  • Seungryong Kim + 3 more

We present a method for jointly predicting a depth map and intrinsic images from single-image input. The two tasks are formulated in a synergistic manner through a joint conditional random field (CRF) that is solved using a novel convolutional neural network (CNN) architecture, called the joint convolutional neural field (JCNF) model. Tailored to our joint estimation problem, JCNF differs from previous CNNs in its sharing of convolutional activations and layers between networks for each task, its inference in the gradient domain where there exists greater correlation between depth and intrinsic images, and the incorporation of a gradient scale network that learns the confidence of estimated gradients in order to effectively balance them in the solution. This approach is shown to surpass state-of-the-art methods both on single-image depth estimation and on intrinsic image decomposition.

  • Conference Article
  • Cited by 16
  • 10.1109/iciea.2016.7603567
Contactless fingerprint enhancement via intrinsic image decomposition and guided image filtering
  • Jun 1, 2016
  • Xuefei Yin + 2 more

Although contactless fingerprint images are rarely affected by skin conditions and finger pressure in comparison with touch-based fingerprint images, they are usually noisy and suffer from low ridge-valley contrast. This paper proposes a robust contactless fingerprint enhancement method based on intrinsic image decomposition and guided image filtering. In order to strengthen the contrast of ridge and valley, intrinsic image decomposition is firstly performed on the observed fingerprint image. Then, the obtained intrinsic fingerprint image is used as the guided image to filter the observed fingerprint image, which can efficiently eliminate noise while preserving the ridge-valley information. Finally, an improved Gabor-based contextual filter is adopted to further enhance the fingerprint image quality. Experimental results of minutiae extraction based on the enhanced fingerprint image demonstrate the validity of the proposed method.
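
The guided image filter fits a local linear model of the output in the guide, which is how ridge-valley structure from the intrinsic image can steer smoothing of the observed fingerprint. A 1-D NumPy sketch of the standard guided filter (the method operates on 2-D images; the radius and eps here are illustrative):

```python
import numpy as np

def box(x, r):
    """Mean over a sliding window of radius r (1-D, edge-padded)."""
    pad = np.pad(x, r, mode='edge')
    k = 2 * r + 1
    c = np.cumsum(np.concatenate(([0.0], pad)))
    return (c[k:] - c[:-k]) / k

def guided_filter(guide, src, r=2, eps=1e-4):
    """Edge-preserving smoothing of `src` steered by `guide`:
    in each window the output is a + b where a, b come from a
    local linear regression of src on guide."""
    mI, mp = box(guide, r), box(src, r)
    corr = box(guide * src, r)
    var = box(guide * guide, r) - mI * mI
    a = (corr - mI * mp) / (var + eps)   # local linear coefficient
    b = mp - a * mI
    return box(a, r) * guide + box(b, r)

# A sharp ridge-valley edge in the guide is preserved in the output.
x = np.concatenate([np.zeros(8), np.ones(8)])
y = guided_filter(x, x, r=2, eps=1e-6)
```

In flat regions the local variance vanishes, so the filter reduces to plain averaging (denoising); across edges the linear coefficient approaches one, so the edge passes through, which is exactly the noise-removal-with-ridge-preservation behavior the abstract describes.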

  • Conference Article
  • Cited by 194
  • 10.1109/cvpr.2018.00942
Learning Intrinsic Image Decomposition from Watching the World
  • Jun 1, 2018
  • Zhengqi Li + 1 more

Single-view intrinsic image decomposition is a highly ill-posed problem, and so a promising approach is to learn from large amounts of data. However, it is difficult to collect ground truth training data at scale for intrinsic images. In this paper, we explore a different approach to learning intrinsic images: observing image sequences over time depicting the same scene under changing illumination, and learning single-view decompositions that are consistent with these changes. This approach allows us to learn without ground truth decompositions, and to instead exploit information available from multiple images when training. Our trained model can then be applied at test time to single views. We describe a new learning framework based on this idea, including new loss functions that can be efficiently evaluated over entire sequences. While prior learning-based methods achieve good performance on specific benchmarks, we show that our approach generalizes well to several diverse datasets, including MIT intrinsic images, Intrinsic Images in the Wild and Shading Annotations in the Wild.

  • Research Article
  • Cited by 18
  • 10.1016/j.cviu.2021.103183
Physics-based shading reconstruction for intrinsic image decomposition
  • Feb 13, 2021
  • Computer Vision and Image Understanding
  • Anil S Baslamisli + 3 more

We investigate the use of photometric invariance and deep learning to compute intrinsic images (albedo and shading). We propose albedo and shading gradient descriptors derived from physics-based models. Using the descriptors, albedo transitions are masked out and an initial sparse shading map is calculated directly from the corresponding RGB image gradients in a learning-free, unsupervised manner. Then, an optimization method is proposed to reconstruct the full dense shading map. Finally, we integrate the generated shading map into a novel deep learning framework to refine it and to predict the corresponding albedo image, achieving intrinsic image decomposition. By doing so, we are the first to directly address the texture and intensity ambiguity problems of shading estimation. Large-scale experiments show that our approach, steered by physics-based invariant descriptors, achieves superior results on the MIT Intrinsics, NIR-RGB Intrinsics, Multi-Illuminant Intrinsic Images, Spectral Intrinsic Images, and As Realistic As Possible datasets, and competitive results on the Intrinsic Images in the Wild dataset, while achieving state-of-the-art shading estimations.
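
The invariance these descriptors exploit can be shown concretely: shading rescales R, G, and B jointly, so chromaticity is unchanged by shading but changes at albedo transitions. A sketch that masks albedo edges by thresholding chromaticity gradients (the threshold and this exact descriptor form are illustrative assumptions, not the paper's definitions):

```python
import numpy as np

def albedo_edge_mask(img, thresh=0.01, eps=1e-8):
    """Mark pixels whose chromaticity changes along x: shading
    rescales R, G, B jointly and leaves chromaticity constant,
    so chromaticity gradients indicate albedo transitions."""
    chroma = img / (img.sum(axis=-1, keepdims=True) + eps)
    grad = np.abs(np.diff(chroma, axis=1)).sum(axis=-1)
    return grad > thresh

# Row of pixels: a shading change (same color, dimmer) then a
# material change (different chromaticity).
row = np.array([[[0.4, 0.2, 0.2],
                 [0.2, 0.1, 0.1],    # half brightness: shading edge
                 [0.1, 0.1, 0.2]]])  # new chromaticity: albedo edge
mask = albedo_edge_mask(row)
```

Gradients falling outside the mask can then be attributed to shading, which is how a sparse shading map can be read directly off the RGB gradients.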

  • Research Article
  • Cited by 3
  • 10.1109/tvcg.2024.3366343
Intrinsic Omnidirectional Image Decomposition With Illumination Pre-Extraction.
  • Jul 1, 2024
  • IEEE transactions on visualization and computer graphics
  • Rong-Kai Xu + 2 more

Capturing an omnidirectional image with a 360-degree field of view entails capturing intricate spatial and lighting details of the scene. Consequently, existing intrinsic image decomposition methods face significant challenges when attempting to separate reflectance and shading components from low dynamic range (LDR) omnidirectional images. To address this, our article introduces a novel method specifically designed for the intrinsic decomposition of omnidirectional images. Leveraging the unique characteristics of the 360-degree scene representation, we employ a pre-extraction technique to isolate specific illumination information. Subsequently, we establish new constraints based on these extracted details and the inherent characteristics of omnidirectional images. These constraints limit the illumination intensity range and incorporate spherical-based illumination variation. By formulating and solving an objective function that accounts for these constraints, our method achieves a more accurate separation of reflectance and shading components. Comprehensive qualitative and quantitative evaluations demonstrate the superiority of our proposed method over state-of-the-art intrinsic decomposition methods.

  • Conference Article
  • Cited by 1
  • 10.1145/3571600.3571603
Interpreting Intrinsic Image Decomposition using Concept Activations
  • Dec 8, 2022
  • Avani Gupta + 2 more

Evaluation of ill-posed problems like Intrinsic Image Decomposition (IID) is challenging. IID involves decomposing an image into its constituent illumination-invariant Reflectance (R) and albedo-invariant Shading (S) components. Contemporary IID methods use Deep Learning models and require large datasets for training. The evaluation of IID is carried out on either synthetic Ground Truth images or sparsely annotated natural images. A scene can be split into reflectance and shading in multiple, valid ways. Comparison with one specific decomposition in the ground-truth images used by current IID evaluation metrics like LMSE, MSE, DSSIM, WHDR, SAW AP%, etc., is inadequate. Measuring R-S disentanglement is a better way to evaluate the quality of IID. Inspired by ML interpretability methods, we propose Concept Sensitivity Metrics (CSM) that directly measure disentanglement using sensitivity to relevant concepts. Activation vectors for albedo invariance and illumination invariance concepts are used for the IID problem. We evaluate and interpret three recent IID methods on our synthetic benchmark of controlled albedo and illumination invariance sets. We also compare our disentanglement score with existing IID evaluation metrics on both natural and synthetic scenes and report our observations. Our code and data are publicly available for reproducibility.
