Abstract
Underwater images suffer from several types of degradation: color degradation appears in the spatial domain, and edge degradation in the frequency domain. High-quality underwater image enhancement is a crucial step toward computer vision systems tailored for marine environments, and it supports a wide range of applications, including underwater inspection, underwater archaeology, and environmental monitoring. However, current convolutional neural network (CNN)-based pyramid frameworks focus primarily on capturing local features and often overlook the global semantic features that play a crucial role in understanding underwater scenes. Moreover, these frameworks handle spatial and frequency features independently, so they do not exploit the correlation among domain-specific attributes to ensure information consistency. In addition, optimizing the model with a loss function built on the same domain attributes from the ground truth may not lead to better generalization. To solve these problems, we propose a new Convolution-Transformer Blend Pyramid Network (CTPN), which consists of a spatial branch and several frequency branches. The CTPN has three key components: a Swin transformer encoder, a CNN-Transformer aggregated encoder–decoder (CTED), and a blend pyramid framework. The Swin transformer encoder captures global semantic features, benefiting from its ability to extract long-range and global dependencies among features. In the spatial branch, the CTED fuses the local features captured by CNN layers with the global semantic features captured by the Swin transformer encoder, with the help of the Cross-Model Fusion Module (CFM) and the Skip-Aggregation Module (SAM).
Subsequently, a blend pyramid framework is designed that progressively expands the transformed information of the previous domain branch into the current domain branch via a CTED-based refining operation, and that uses the proposed Domain Affinity Block (DAB) to explore the connection between domain attributes, ensuring information consistency. Experimental results demonstrate that the proposed method outperforms existing underwater image enhancement methods both quantitatively and qualitatively.