Video Enhancement Research Articles

In HTTP Adaptive Streaming (HAS), each video is divided into smaller segments, and each segment is encoded at multiple pre-defined bitrates to construct a bitrate ladder. To optimize bitrate ladders, per-title encoding approaches encode each segment at various bitrates and resolutions to determine the convex hull. From the convex hull, an optimized bitrate ladder is constructed, resulting in an increased Quality of Experience (QoE) for end-users. As deep learning-based video enhancement approaches become more efficient, they are increasingly employed at the client side to increase the QoE, specifically when GPU capabilities are available. Therefore, scalable approaches are needed to support end-user devices with both CPU and GPU capabilities (denoted as CPU-only and GPU-available end-users, respectively) as a new dimension of a bitrate ladder. To address this need, we propose DeepStream, a scalable content-aware per-title encoding approach to support both CPU-only and GPU-available end-users. (i) To support backward compatibility, DeepStream constructs a bitrate ladder based on any existing per-title encoding approach. Therefore, the video content will be provided for legacy end-user devices with CPU-only capabilities as a base layer (BL).
(ii) For high-end end-user devices with GPU capabilities, an enhancement layer (EL) is added on top of the base layer, comprising lightweight video super-resolution deep neural networks (DNNs) for each bitrate-resolution pair of the bitrate ladder. A content-aware video super-resolution approach leads to higher video quality, but at the cost of bitrate overhead. To reduce the bitrate overhead for streaming content-aware video super-resolution DNNs, DeepCABAC, context-adaptive binary arithmetic coding for DNN compression, is used. Furthermore, the similarities among (i) segments within a scene and (ii) frames within a segment are used to reduce the training costs of DNNs. Experimental results show bitrate savings of 34% and 36% to maintain the same PSNR and VMAF, respectively, for GPU-available end-users, while the CPU-only users get the desired video content as usual.
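
The convex-hull step mentioned in the abstract can be illustrated with a small sketch. This is not DeepStream's implementation; it only shows the general per-title idea of keeping the bitrate-resolution pairs that lie on the upper convex hull of the (bitrate, quality) plane. The `Encode` type, the `convex_hull_ladder` function, and all trial values below are hypothetical and chosen purely for illustration.

```python
from typing import NamedTuple


class Encode(NamedTuple):
    """One trial encode of a segment (values used below are made up)."""
    bitrate_kbps: int
    height: int      # vertical resolution, e.g. 720 for 720p
    quality: float   # any quality score where higher is better, e.g. VMAF


def _cross(o: Encode, a: Encode, b: Encode) -> float:
    """Cross product of vectors OA and OB in the (bitrate, quality) plane."""
    return ((a.bitrate_kbps - o.bitrate_kbps) * (b.quality - o.quality)
            - (a.quality - o.quality) * (b.bitrate_kbps - o.bitrate_kbps))


def convex_hull_ladder(encodes: list[Encode]) -> list[Encode]:
    """Return the encodes on the upper convex hull, ordered by bitrate."""
    # Step 1: at each bitrate, keep only the resolution with the best quality.
    best: dict[int, Encode] = {}
    for e in encodes:
        current = best.get(e.bitrate_kbps)
        if current is None or e.quality > current.quality:
            best[e.bitrate_kbps] = e
    points = sorted(best.values(), key=lambda e: e.bitrate_kbps)

    # Step 2: monotone-chain upper hull. Walking left to right, a point is
    # removed when it lies on or below the line joining its neighbours.
    hull: list[Encode] = []
    for p in points:
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    # Step 3: drop any trailing points where quality no longer increases.
    return [p for i, p in enumerate(hull)
            if i == 0 or p.quality > hull[i - 1].quality]


# Hypothetical trial encodes of one segment at three resolutions.
trials = [
    Encode(500, 360, 60.0), Encode(500, 720, 50.0),
    Encode(1000, 360, 68.0), Encode(1000, 720, 75.0), Encode(1000, 1080, 65.0),
    Encode(1500, 720, 78.0), Encode(1500, 1080, 76.0),
    Encode(2000, 720, 84.0), Encode(2000, 1080, 86.0),
    Encode(4000, 720, 88.0), Encode(4000, 1080, 95.0),
]

ladder = convex_hull_ladder(trials)
for rung in ladder:
    print(rung.bitrate_kbps, rung.height, rung.quality)
```

With these made-up numbers, the 1500 kbps encode falls below the line between its neighbours and is dropped, so the resulting ladder has four rungs. In the scheme the abstract describes, such a ladder would serve as the base layer, with a super-resolution DNN attached to each rung as the enhancement layer.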

Image desmoking is a significant aspect of endoscopic image processing, effectively mitigating visual field obstructions without the need for additional surgical interventions. However, current smoke removal techniques tend to apply video enhancement to all frames, both smoke-free and smoke-affected, which not only raises computational costs but can also introduce noise when smoke-free images are enhanced. In response to this challenge, this paper introduces an approach for classifying images that contain surgical smoke within endoscopic scenes. This classification provides the target-frame information needed for surgical smoke removal, improving the robustness and the real-time processing capabilities of image-based smoke removal methods. The proposed endoscopic smoke image classification algorithm, based on an improved Poolformer model, augments the model's capacity for endoscopic image feature extraction. This enhancement is achieved by transforming the Token Mixer within the encoder into a multi-branch structure akin to ConvNeXt, a pure convolutional neural network. Moreover, the conversion to a single-path topology during the prediction phase increases processing speed. Experiments use an endoscopic dataset sourced from the Hamlyn Centre Laparoscopic/Endoscopic Video Dataset, augmented by Blender software rendering. The dataset comprises 3,800 training images and 1,200 test images, distributed in a 4:1 ratio of smoke-free to smoke-containing images. The outcomes affirm the superior performance of this paper's approach across multiple metrics. Comparative assessments against existing models, such as mobilenet_v3, efficientnet_b7, and ViT-B/16, show that the proposed method excels in accuracy, sensitivity, and inference speed.
Notably, compared with the Poolformer_s12 network, the proposed method achieves a 2.3% improvement in accuracy and an 8.2% improvement in sensitivity, at the cost of a reduction of only 6.4 frames per second in processing speed, while still running at 87 frames per second. The results confirm the improved performance of the refined Poolformer model in endoscopic smoke image classification tasks. This advancement presents a lightweight yet effective solution for the automatic detection of smoke-containing images in endoscopy. The approach strikes a balance between the accuracy and real-time processing requirements of endoscopic image analysis, offering valuable insights for targeted desmoking.
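
The abstract reports accuracy and sensitivity separately, and a short sketch shows why both matter on an imbalanced dataset. The code below is an illustration only, not the paper's evaluation code. It assumes that the 4:1 ratio applies to the 1,200 test images (960 smoke-free, 240 smoke-containing), which the abstract does not state explicitly, and both sets of predictions are invented.

```python
def accuracy_and_sensitivity(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity) for binary labels.

    Sensitivity is the fraction of positive (smoke-containing) images
    that the classifier correctly flags: TP / (TP + FN).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


# Assumed test split: 0 = smoke-free, 1 = smoke-containing (4:1 ratio).
y_true = [0] * 960 + [1] * 240

# Hypothetical classifier A: labels every frame as smoke-free.
pred_a = [0] * 1200

# Hypothetical classifier B: 12 false alarms and 24 missed smoke frames.
pred_b = [1] * 12 + [0] * 948 + [0] * 24 + [1] * 216

acc_a, sens_a = accuracy_and_sensitivity(y_true, pred_a)
acc_b, sens_b = accuracy_and_sensitivity(y_true, pred_b)
print(f"A: accuracy={acc_a:.3f} sensitivity={sens_a:.3f}")
print(f"B: accuracy={acc_b:.3f} sensitivity={sens_b:.3f}")
```

Under this assumed split, classifier A reaches 80% accuracy while detecting no smoke frames at all, which is why a gain in sensitivity is a meaningful result alongside the accuracy figure.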

Articles published on Video Enhancement

[Paper] A Cache Decision Policy for QoE Enhancement of Video and Audio Transmission over ICN/CCN

Multi-Frame Quality Recovery Model for Compressed Video Enhancement

A real-time interactive restoration system for intraoral digital videos using segment anything model.

DeepStream: Video Streaming Enhancements using Compressed Deep Neural Networks

Video Sequence Enhancement in Video Analytics Systems

Adaptive Locally-Aligned Transformer for low-light video enhancement

Learning Degradation-Robust Spatiotemporal Frequency-Transformer for Video Super-Resolution.

Blind quality-based pairwise ranking of contrast changed color images using deep networks

Wavelet energy-based adaptive retinex algorithm for low light mobile video enhancement

Endoscopic image classification algorithm based on Poolformer.

Spatio-Temporal Coherence of mmWave/THz Channel Characteristics and Their Forecasting Using Video Frame Prediction Techniques

Adaptive rule-based colour component weight assignment strategy for underwater video enhancement

ALIVE: Adaptive-Chromaticity for Interactive Low-light Image and Video Enhancement

Application of video image processing in sports action recognition based on particle swarm optimization algorithm

Low-Light Video Enhancement with Synthetic Event Guidance

Spatio-temporal propagation and reconstruction for low-light video enhancement

Non-contact human respiratory rate measurement under dark environments by low-light video enhancement

A Survey on Compression Domain Image and Video Data Processing and Analysis Techniques

OVQE: Omniscient Network for Compressed Video Quality Enhancement

Real-time Image Enhancement with Attention Aggregation
