Joint End-to-End Image Compression and Denoising: Leveraging Contrastive Learning and Multi-Scale Self-ONNs
Noisy images are a challenge to image compression algorithms due to the inherent difficulty of compressing noise. Because noise cannot easily be discerned from image details, such as high-frequency signals, its presence leads to extra bits being needed for compression. Since the emerging learned image compression paradigm enables end-to-end optimization of codecs, recent efforts have been made to integrate denoising into the compression model, relying on clean image features to guide denoising. However, these methods exhibit suboptimal performance under high noise levels and lack the capability to generalize across diverse noise types. In this paper, we propose a novel method for joint image compression and denoising that integrates a multi-scale denoiser comprising Self-Organized Operational Neural Networks (Self-ONNs). We employ contrastive learning to boost the network's ability to differentiate noise from high-frequency signal components by emphasizing the correlation between noisy and clean counterparts. Experimental results demonstrate the effectiveness of the proposed method in both rate-distortion performance and codec speed, outperforming the current state-of-the-art.
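The contrastive objective described in this abstract can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's actual loss: it uses a generic InfoNCE formulation over paired noisy/clean feature embeddings, and the encoder (here a placeholder) as well as the Self-ONN layers and any loss weighting are left abstract.

```python
import torch
import torch.nn.functional as F

def noisy_clean_infonce(z_noisy, z_clean, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch of paired embeddings.

    Each noisy embedding is pulled toward the embedding of its own clean
    counterpart (the positive) and pushed away from every other clean
    embedding in the batch (the negatives).
    """
    z_noisy = F.normalize(z_noisy, dim=1)          # (B, D) unit vectors
    z_clean = F.normalize(z_clean, dim=1)          # (B, D) unit vectors
    logits = z_noisy @ z_clean.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(z_noisy.size(0), device=z_noisy.device)
    return F.cross_entropy(logits, targets)        # positives on the diagonal

# usage with a hypothetical feature encoder applied to paired crops:
# loss = noisy_clean_infonce(encoder(noisy_batch), encoder(clean_batch))
```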
- Research Article
11
- 10.3389/frsip.2022.932873
- Sep 2, 2022
- Frontiers in Signal Processing
When it comes to image compression in digital cameras, denoising is traditionally performed prior to compression. However, there are applications where image noise may be necessary to demonstrate the trustworthiness of the image, such as court evidence and image forensics. This means that the noise itself needs to be coded, in addition to the clean image. In this paper, we present a learning-based image compression framework where image denoising and compression are performed jointly. The latent space of the image codec is organized in a scalable manner such that the clean image can be decoded from a subset of the latent space (the base layer), while the noisy image is decoded from the full latent space at a higher rate. Using a subset of the latent space for the denoised image allows denoising to be carried out at a lower rate. Besides providing a scalable representation of the noisy input image, performing denoising jointly with compression makes intuitive sense because noise is hard to compress; hence, compressibility is one of the criteria that may help distinguish noise from the signal. The proposed codec is compared against established compression and denoising benchmarks, and the experiments reveal considerable bitrate savings compared to a cascade combination of a state-of-the-art codec and a state-of-the-art denoiser.
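The latent-space organization described here can be sketched roughly as follows. This is a toy illustration under assumed layer sizes (the paper's actual architecture, entropy model, and rate allocation are omitted): the clean-image decoder only ever sees the first `base_ch` latent channels, while the noisy-image decoder sees the full latent.

```python
import torch.nn as nn

class ScalableDenoisingCodec(nn.Module):
    """Toy two-layer scalable latent space: the first `base_ch` latent
    channels (base layer) decode the clean image; the full latent
    decodes the noisy input. Layer sizes are hypothetical."""
    def __init__(self, base_ch=96, enh_ch=96):
        super().__init__()
        total = base_ch + enh_ch
        self.base_ch = base_ch
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 128, 5, stride=2, padding=2), nn.GELU(),
            nn.Conv2d(128, total, 5, stride=2, padding=2))
        self.dec_clean = nn.Sequential(  # sees only the base layer
            nn.ConvTranspose2d(base_ch, 128, 5, 2, 2, output_padding=1), nn.GELU(),
            nn.ConvTranspose2d(128, 3, 5, 2, 2, output_padding=1))
        self.dec_noisy = nn.Sequential(  # sees the full latent
            nn.ConvTranspose2d(total, 128, 5, 2, 2, output_padding=1), nn.GELU(),
            nn.ConvTranspose2d(128, 3, 5, 2, 2, output_padding=1))

    def forward(self, noisy):
        y = self.encoder(noisy)
        clean_hat = self.dec_clean(y[:, :self.base_ch])  # base layer only
        noisy_hat = self.dec_noisy(y)                    # base + enhancement
        return clean_hat, noisy_hat
```

Training would add a rate term on the quantized latent; only the two reconstruction paths are shown here.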
- Research Article
7
- 10.1007/s10851-006-9000-x
- Aug 14, 2006
- Journal of Mathematical Imaging and Vision
In image denoising, many researchers have tried for several years to combine wavelet-like approaches and optimization methods (typically based on total variation minimization). However, despite the well-known links between image denoising and image compression when solved with wavelet-like approaches, these hybrid image denoising methods have not found counterparts in image compression. This is the gap that this paper aims to fill. To do so, we provide a generalization of the standard image compression model. However, important numerical limitations still need to be addressed in order to make such models practical.
- Research Article
6
- 10.1016/j.cam.2010.10.009
- Oct 22, 2010
- Journal of Computational and Applied Mathematics
Construction of parameterizations of masks for tight wavelet frames with two symmetric/antisymmetric generators and applications in image compression and denoising
- Research Article
127
- 10.14569/ijacsa.2011.020309
- Jan 1, 2011
- International Journal of Advanced Computer Science and Applications
This paper proposes different approaches to wavelet-based image denoising. The search for efficient image denoising methods is still a valid challenge at the crossroads of functional analysis and statistics. In spite of the sophistication of recently proposed methods, most algorithms have not yet attained a desirable level of applicability. Wavelet algorithms are a useful tool for signal processing tasks such as image compression and denoising. Multiwavelets can be considered an extension of scalar wavelets. The main aim is to modify the wavelet coefficients in the new basis so that the noise can be removed from the data. In this paper, we extend the existing technique and provide a comprehensive evaluation of the proposed method. Results are reported for different noise types, such as Gaussian, Poisson, salt-and-pepper, and speckle noise. Signal-to-noise ratio was preferred as a measure of denoising quality.
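The coefficient-modification step this abstract describes is, in its scalar-wavelet form, the classic thresholding recipe: decompose, shrink the detail coefficients, reconstruct. A minimal sketch with PyWavelets (the paper's multiwavelet extension is not reproduced here, and the threshold value is an arbitrary placeholder):

```python
import pywt

def wavelet_soft_denoise(img, wavelet="db4", level=3, thresh=20.0):
    """Denoise by soft-thresholding detail coefficients in the wavelet basis."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    approx, details = coeffs[0], coeffs[1:]
    shrunk = [
        tuple(pywt.threshold(d, thresh, mode="soft") for d in band)
        for band in details                     # (cH, cV, cD) per level
    ]
    return pywt.waverec2([approx] + shrunk, wavelet)
```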
- Research Article
12
- 10.1007/s11760-014-0640-9
- May 8, 2014
- Signal, Image and Video Processing
B-splines have attracted interest in many engineering applications due to their flexibility, large degree of differentiability, and favorable cost/quality trade-off. However, they have less impact in continuous-time applications, as they are constructed from piecewise polynomials. On the other hand, exponential spline polynomials (E-splines) represent the best smooth transition between the continuous and discrete domains, as they are made of exponential segments. In this paper, we present a complete analysis of an E-spline-based subband coding (wavelet) perfect reconstruction (PR) system. Derivations for the scaling and wavelet functions are presented, along with applications of the proposed system in image compression and image denoising. In image compression, the proposed technique is compared with the B-spline-based PR system as well as the basic wavelet subband system with the SPIHT image codec. In image denoising, we report the enhancement achieved with the proposed E-spline-based denoising approach compared with B-spline-based denoising and another basic denoising technique. In both applications, E-splines show superior performance, as illustrated.
- Research Article
4
- 10.21917/ijivp.2022.0402
- Nov 1, 2022
- ICTACT Journal on Image and Video Processing
Image compression has been an essential subfield of image processing for generations. It should be an effective process that decreases file size without significantly lowering quality; yet image quality and visibility suffer as the noise rate increases. To remain accurate, developers use a technology called denoising, which increases image quality, decreases the effects of noise, and restores the compressed image toward its original condition. Image denoising is an effective process for producing a single graphically premium-quality image from an image dataset. This paper analyzes the development of denoising convolutional neural networks (DnCNNs), which incorporate advances in classification models, machine learning, and maximum-likelihood methods into image denoising. In particular, residual learning, along with batch normalization, is utilized to speed up the training stage while enhancing denoising effectiveness. Images encoded with block-based optimization techniques display blocking, which is among the most perplexing artifacts in compressed images and video. Furthermore, the DnCNN architecture is used to handle a variety of image restoration tasks, including single-image super-resolution and JPEG deblocking, which can be carried out successfully by exploiting its computations.
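The residual-learning idea behind DnCNN is compact enough to sketch: the network predicts the noise map and subtracts it from the input, with batch normalization in the interior layers to stabilize and speed up training. A minimal PyTorch rendition (depth and width are illustrative defaults, not necessarily the evaluated configuration):

```python
import torch.nn as nn

class DnCNN(nn.Module):
    """Predicts the residual (noise); clean estimate = input - residual."""
    def __init__(self, channels=1, features=64, depth=17):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1),
                  nn.ReLU(inplace=True)]
        for _ in range(depth - 2):           # middle layers: Conv + BN + ReLU
            layers += [
                nn.Conv2d(features, features, 3, padding=1, bias=False),
                nn.BatchNorm2d(features),
                nn.ReLU(inplace=True),
            ]
        layers.append(nn.Conv2d(features, channels, 3, padding=1))
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        return noisy - self.body(noisy)      # residual learning
```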
- Book Chapter
- 10.1049/pbte094e_ch4
- Apr 14, 2021
Recent research efforts have established the wavelet transform as one of the most significant tools in medical image enhancement and processing. The problems of image fusion, compression, edge detection, denoising, and contrast enhancement can be handled by the discrete wavelet transform (DWT) in the Internet of Medical Things (IoMT) framework. In this chapter, we present novel applications of the DWT with orthogonal and biorthogonal wavelets. Multiple applications of the wavelet transform to medical images are presented, demonstrating the successful impact of applying the DWT. The DWT has the ability to enhance the medical image and remove noise. In image compression, the DWT separates the image into high- and low-frequency components, reducing computational complexity; this process reduces the image data so that it can be stored or transmitted in an efficient form. There are some advantages to using DWT-based fusion over other traditional methods, for example, reduced features and energy compaction. In digital watermarking, the DWT technique is used for embedding and extraction of the watermark in the original image.
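The compression mechanism mentioned above (separating low- and high-frequency content and keeping only the significant part) can be demonstrated with a toy coefficient-truncation sketch. A real codec would add quantization and entropy coding, both omitted here, and the `keep` fraction is an arbitrary assumption:

```python
import numpy as np
import pywt

def dwt_compress(img, wavelet="haar", level=3, keep=0.05):
    """Keep only the largest `keep` fraction of DWT coefficients, zero the rest."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    arr, slices = pywt.coeffs_to_array(coeffs)        # flatten to one array
    cutoff = np.quantile(np.abs(arr), 1.0 - keep)     # magnitude threshold
    arr[np.abs(arr) < cutoff] = 0.0                   # discard small details
    kept = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
    return pywt.waverec2(kept, wavelet)
```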
- Research Article
13
- 10.14419/ijet.v7i3.27.17972
- Aug 15, 2018
- International Journal of Engineering & Technology
This work surveys and compares different methods of image denoising based on wavelet transforms and convolutional neural networks. Distinctive combinations of the two have been used in the search for a better image denoising method. A vital role of communication is transmitting visual information in the form of digital images, but on the receiver side the image often arrives corrupted. Therefore, in practice, a powerful image denoising approach is still a legitimate undertaking. Wavelet transforms are algorithms that are very beneficial for signal processing tasks such as image compression and image denoising. To obtain a better-quality output image, denoising methods involve manipulating the data of that image. The primary aim is wavelet coefficient modification in the new basis, by which the noise within the image data can be eliminated. In this paper, we examine different methods of denoising images corrupted by different noises, such as Gaussian and speckle noise. The methods are implemented using adaptive wavelet thresholding (SureShrink, BlockShrink, NeighShrink, and BivariateShrink) and a convolutional neural network (CNN) model, and the experimental results compare the accuracy of the proposed approaches.
- Conference Article
52
- 10.1109/wocn.2013.6616235
- Jul 1, 2013
Removing noise from the original signal is still a challenging job for researchers. Numerous algorithms have been published, each targeting the removal of noise from the original signal. This paper presents results of some significant work in the area of image denoising: we explore denoising of images using several thresholding methods, such as SureShrink, VisuShrink, and BayesShrink, and report results for different wavelet-based image denoising approaches. Finding the best method for image denoising is still a valid challenge at the crossroads of functional analysis and statistics. We extend the existing technique and provide a comprehensive evaluation of the proposed method. Results are reported for various types of noise, such as Gaussian, Poisson, salt-and-pepper, and speckle. Signal-to-noise ratio (SNR) and mean square error (MSE) were preferred as measures of denoising quality. Wavelet algorithms are a very useful tool for signal processing tasks such as image compression and image denoising. The main aim is to modify the wavelet coefficients in the new basis so that the noise can be minimized or removed from the data.
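Of the thresholding rules compared in this abstract, VisuShrink's universal threshold is the easiest to state: lambda = sigma * sqrt(2 ln N), with sigma estimated from the finest-scale detail coefficients via the median absolute deviation. A short sketch of that rule alone (SureShrink and BayesShrink choose the threshold differently):

```python
import numpy as np
import pywt

def visushrink_denoise(img, wavelet="db4", level=3):
    """VisuShrink: universal threshold sigma*sqrt(2*ln(N)), MAD noise estimate."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    # noise sigma from the finest diagonal detail band (robust MAD estimator)
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(img.size))
    shrunk = [
        tuple(pywt.threshold(d, thresh, mode="soft") for d in band)
        for band in coeffs[1:]
    ]
    return pywt.waverec2([coeffs[0]] + shrunk, wavelet)
```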
- Conference Article
1
- 10.1109/icalip.2010.5685196
- Nov 1, 2010
The wavelet transform has been successfully used in many applications such as image compression, image denoising, and computer vision. Recently, research on image denoising has focused on developing new methods that can represent image edges more efficiently. Ridgelets are a new system of representations that deals effectively with line singularities in 2-D. In this paper, a simple algorithm is derived for the ridgelet transform of the center pixel of a 3×3 image block. Then, by embedding it into a moving-window pyramid, ridgelet-transform-based multiscale image denoising is obtained. The high directional sensitivity of the ridgelet transform makes the proposed method a good choice for edge preservation. Experiments show that the new algorithm is effective.
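The standard construction underlying the ridgelet transform is a 1-D wavelet transform applied along the projections of a Radon transform; that general idea, not the paper's specific 3×3 moving-window algorithm, is sketched below. The angle count and wavelet choice are arbitrary assumptions:

```python
import numpy as np
import pywt
from skimage.transform import radon

def ridgelet_coefficients(img, n_angles=180, wavelet="db4"):
    """Finite ridgelet sketch: Radon projections, then a 1-D DWT along each."""
    theta = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    sinogram = radon(img, theta=theta)          # (radial, angle) projections
    # wavelet transform along the radial axis of every projection
    approx, detail = pywt.dwt(sinogram, wavelet, axis=0)
    return approx, detail
```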
- Research Article
34
- 10.3390/electronics11030418
- Jan 29, 2022
- Electronics
Image denoising is an important low-level computer vision task, which aims to reconstruct a noise-free, high-quality image from a noisy image. With the development of deep learning, convolutional neural networks (CNNs) have gradually been applied to and achieved great success in image denoising, image compression, image enhancement, etc. Recently, the Transformer has been a hot technique, widely used to tackle computer vision tasks. However, few Transformer-based methods have been proposed for low-level vision tasks. In this paper, we propose an image denoising network based on the Transformer, named DenSformer. DenSformer consists of three modules: a preprocessing module, a local-global feature extraction module, and a reconstruction module. Specifically, the local-global feature extraction module consists of several Sformer groups, each of which has several ETransformer layers and a convolution layer, together with a residual connection. These Sformer groups are densely skip-connected to fuse the features of different layers, and they jointly capture the local and global information from the given noisy images. We evaluate our model in comprehensive experiments. In synthetic noise removal, DenSformer outperforms other state-of-the-art methods by up to 0.06–0.28 dB on gray-scale images and 0.57–1.19 dB on color images. In real noise removal, DenSformer achieves comparable performance, while the number of parameters can be reduced by up to 40%. Experimental results prove that our DenSformer achieves improvements over some state-of-the-art methods, for both synthetic and real noise data, in objective and subjective evaluations.
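The group structure described here (Transformer layers plus a convolution layer, wrapped in a residual connection) can be approximated with standard PyTorch modules. A hedged sketch with invented sizes, using a vanilla encoder layer in place of the paper's ETransformer and omitting the dense inter-group skips:

```python
import torch.nn as nn

class SformerGroupSketch(nn.Module):
    """Transformer layers + conv, with a residual connection (illustrative)."""
    def __init__(self, dim=64, heads=4, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=4 * dim,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.conv = nn.Conv2d(dim, dim, 3, padding=1)

    def forward(self, x):                       # x: (B, C, H, W) feature map
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C) token sequence
        tokens = self.transformer(tokens)
        y = tokens.transpose(1, 2).reshape(b, c, h, w)
        return x + self.conv(y)                 # residual connection
```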
- Research Article
1
- 10.53555/nnmhs.v4i9.603
- Sep 30, 2018
- Journal of Advance Research in Medical & Health Science (ISSN: 2208-2425)
There are calls for enhancing present healthcare sectors when it comes to handling the huge data size of patients' records. The huge files contain many duplicate copies; therefore, the idea of compression comes into play. Image data compression removes redundant copies (multiple unnecessary copies) that increase storage space and transmission bandwidth. Image data compression is pivotal, as it helps reduce image file size and speeds up file transmission over the internet through multiple wavelet analysis methods without loss in the transmitted medical image data. This report therefore presents a data compression implementation for healthcare systems using a proposed scheme of the discrete wavelet transform (DWT), Fourier transform (FT), and fast Fourier transform (FFT), with the capacity to compress and recover medical image data without data loss. Healthcare images, such as those of the human heart and brain, need fast transmission for reliable and efficient results. Using the DWT, which has optimal reconstruction quality, greatly improves compression. A representation of enabling innovations in communication technologies with big data for health monitoring is achievable through effective data compression techniques. Our experimental implementation shows that using the Haar wavelet with parametric determination of MSE and PSNR meets our aims. Many imaging techniques, such as image compression and image denoising, were also deployed to further ascertain the DWT method's efficiency. The proposed compression of medical images performed excellently. It is essential to reduce the size of data sets by employing compression procedures to shrink storage space, reduce transmission time, and limit massive energy usage in health monitoring systems. The motivation for this work was to implement a compression method that modifies the traditional healthcare platform to lower file size and reduce the cost of operation. Image compression aims at reconstructing images from far fewer measurements, in terms of non-zero coefficients, than were previously thought necessary. Rationally, fewer well-chosen measurements are adequate to reproduce the new sample almost exactly as the source image. We use the DWT to implement our compression method.
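The MSE/PSNR evaluation the report relies on is straightforward to reproduce; a small sketch for 8-bit images:

```python
import numpy as np

def mse_psnr(original, reconstructed, peak=255.0):
    """Mean squared error and peak signal-to-noise ratio between two images."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    err = np.mean(diff ** 2)
    psnr = float("inf") if err == 0 else 10.0 * np.log10(peak ** 2 / err)
    return err, psnr
```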
- Research Article
- 10.1002/jcu.70187
- Jan 25, 2026
- Journal of clinical ultrasound : JCU
Image denoising is a crucial pre-processing technique in retinal optical coherence tomography image compression, but existing methods struggle with signal-dependent noise and do not consider hybridized low-contrast residual noise (HLRN), failing to gather information from images. Thus, the novel Smooth Gyrated Texel Quadrivium Network (SGTQN) is proposed to reduce noise and collect self-sufficient information. In the SGTQN, the novel Additive Anscombe Smooth Sifter converts Poisson noise into Gaussian noise using the Anscombe transform and removes unwanted Gaussian noise and the HLRN by hybridized noise removal, thus effectively gathering useful information from the image. After denoising, existing segmentation methodologies neglect the retinal nerve deviation value, creating a poor self-explanatory image. Thus, a novel Improvised Gyrated AlexNet incorporates the Standardized Gyrated Layer, which considers the deviation values, thus generating a self-explanatory segmented image. Furthermore, many existing compression methods fail to achieve a higher-quality image due to their non-uniform compression. The Texel Quadrivium Convolutional Network modifies the pooling layer into a Texel Quadrivium Layer to perform uniform compression and uses adjuvant vector coordinates to generate a high-resolution compressed image. The proposed model provides high-quality image compression with reduced noise, with a high accuracy of 95% and a low mean square error of 0.02.
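The Anscombe step the abstract mentions is a standard variance-stabilizing transform: it maps (approximately) Poisson-distributed intensities to near-unit-variance Gaussian values so that a Gaussian denoiser can be applied. A sketch with the simple algebraic inverse (more accurate unbiased inverses exist; the paper's surrounding machinery is not reproduced):

```python
import numpy as np

def anscombe(x):
    """Variance-stabilizing transform: Poisson -> approximately Gaussian."""
    return 2.0 * np.sqrt(x + 3.0 / 8.0)

def inverse_anscombe(y):
    """Simple algebraic inverse (biased for low photon counts)."""
    return (y / 2.0) ** 2 - 3.0 / 8.0

# denoise in the stabilized domain with any Gaussian denoiser:
# restored = inverse_anscombe(gaussian_denoiser(anscombe(noisy)))
```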
- Conference Article
17
- 10.1117/12.2597828
- Aug 1, 2021
Learning-based approaches to image compression have demonstrated comparable, or even superior, performance compared to conventional approaches in terms of compression efficiency and visual quality. A typical approach in learning-based image compression is the autoencoder, an architecture consisting of two main parts: a multi-layer neural network encoder and a dual decoder. The encoder maps the input image in the pixel domain to a compact representation in a latent space. Subsequently, the decoder reconstructs the original image in the pixel domain from its latent representation as accurately as possible. Traditionally, image processing algorithms, and in particular image denoising, are applied to images in the pixel domain before compression, and possibly even after decompression. Combining the denoising operation with the encoder might reduce the computational cost while achieving the same accuracy. In this paper, the idea of fusing the image denoising operation with the encoder is examined. The results are evaluated both by simulating the human perspective through objective quality metrics and by machine vision algorithms for the use case of face detection.
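Fusing denoising into the encoder, as examined in this paper, amounts in its simplest form to training an autoencoder on noisy inputs with clean targets. A toy sketch (architecture and sizes invented; a real learned codec would add quantization and a rate term):

```python
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Encoder maps noisy pixels to a compact latent; decoder returns clean pixels."""
    def __init__(self, latent_ch=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, latent_ch, 5, stride=2, padding=2))
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_ch, 64, 5, 2, 2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 5, 2, 2, output_padding=1))

    def forward(self, noisy):
        return self.decoder(self.encoder(noisy))

# training step: the reconstruction target is the CLEAN image, so the
# encoder learns to discard noise while compressing:
# loss = nn.functional.mse_loss(model(noisy), clean)
```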
- Research Article
8
- 10.1038/s41598-025-89451-w
- Feb 24, 2025
- Scientific Reports
Image denoising is a critical problem in low-level computer vision, where the aim is to reconstruct a clean, noise-free image from a noisy input, such as a mammogram image. In recent years, deep learning, particularly convolutional neural networks (CNNs), has shown great success in various image processing tasks, including denoising, image compression, and enhancement. While CNN-based approaches dominate, Transformer models have recently gained popularity for computer vision tasks. However, there have been fewer applications of Transformer-based models to low-level vision problems like image denoising. In this study, a novel denoising network architecture called DeepTFormer is proposed, which leverages Transformer models for the task. The DeepTFormer architecture consists of three main components: a preprocessing module, a local-global feature extraction module, and a reconstruction module. The local-global feature extraction module is the core of DeepTFormer, comprising several groups of ITransformer layers. Each group includes a series of Transformer layers, convolutional layers, and residual connections. These groups are tightly coupled with residual connections, which allow the model to capture both local and global information from the noisy images effectively. The design of these groups ensures that the model can utilize both local features for fine details and global features for larger context, leading to more accurate denoising. To validate the performance of the DeepTFormer model, extensive experiments were conducted using both synthetic and real noise data. Objective and subjective evaluations demonstrated that DeepTFormer outperforms leading denoising methods. The model achieved impressive results, surpassing state-of-the-art techniques in terms of key metrics like PSNR, FSIM, EPI, and SSIM, with values of 0.41, 0.93, 0.96, and 0.94, respectively. These results demonstrate that DeepTFormer is a highly effective solution for image denoising, combining the power of Transformer architecture with convolutional layers to enhance both local and global feature extraction.