Low-light image enhancement is a crucial visual task, yet many unsupervised methods overlook the degradation of visible information in low-light scenes, which hinders the fusion of complementary information and prevents satisfactory results. To address this, we introduce Wakeup-Darkness, a multimodal enhancement framework that enriches user interaction through voice and text commands, representing both a technical advance and a paradigm shift in user engagement. We propose a cross-modal feature fusion (CMFF) module that synergizes semantic and depth context with low-light enhancement operations. Moreover, we propose a gated residual block (GRB) and a channel-aware look-up table (LUT) to adjust the intensity distribution of each channel. Crucially, the proposed Wakeup-Darkness scheme demonstrates strong generalization in unsupervised scenarios. The source code is available at https://github.com/zhangbaijin/Wakeup-Dakness.
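To illustrate the channel-aware LUT idea in the abstract, the sketch below remaps each color channel's intensities through its own 1D look-up table. This is a minimal illustration under the assumption that per-channel LUTs are applied by linear interpolation; the function and variable names are hypothetical, not the paper's implementation.

```python
import numpy as np

def apply_channel_lut(image, luts):
    """Apply a separate 1D look-up table to each channel of an image.

    image: float array in [0, 1], shape (H, W, 3).
    luts:  array of shape (3, N) -- one N-entry LUT per channel
           (a hypothetical stand-in for a learned channel-aware LUT).
    """
    n = luts.shape[1]
    xs = np.linspace(0.0, 1.0, n)  # input intensities at which LUT entries sit
    out = np.empty_like(image)
    for c in range(3):
        # Linear interpolation between LUT entries remaps the intensities
        # of channel c, so each channel gets its own tone curve.
        out[..., c] = np.interp(image[..., c], xs, luts[c])
    return out

# Identity LUTs leave the image unchanged; a learned LUT would instead
# brighten dark channels more aggressively than bright ones.
img = np.random.rand(4, 4, 3)
identity = np.tile(np.linspace(0.0, 1.0, 33), (3, 1))
restored = apply_channel_lut(img, identity)
```

A gamma-shaped LUT (e.g. `luts[c] = xs ** 0.5`) would lift shadows per channel, which is the kind of intensity-distribution adjustment the channel-aware LUT is described as performing.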