Light field depth estimation is crucial for many applications, but existing algorithms often falter in regions with complex textures and edges. To address this, we propose a light field depth estimation network based on multi-scale fusion and channel attention (LFMCNet). It incorporates a convolutional multi-scale fusion module to enhance feature extraction and a channel attention mechanism to refine depth map accuracy. Additionally, LFMCNet integrates a Transformer Feature Fusion Module (TFFM) and a Channel Attention-Based Perspective Fusion (CAPF) module for occlusion refinement, effectively handling challenges in occluded regions. Experiments on the 4D HCI benchmark and real-world datasets demonstrate that LFMCNet significantly reduces the Bad Pixel (BP) rate and Mean Squared Error (MSE).
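The abstract does not include code, but the channel attention idea it relies on can be illustrated with a generic squeeze-and-excitation style sketch. All names, dimensions, and the random stand-in weights below are illustrative assumptions, not taken from LFMCNet:

```python
import numpy as np

def channel_attention(x, reduction=4, rng=None):
    """Generic squeeze-and-excitation style channel attention (sketch only).

    x: feature map of shape (C, H, W).
    Returns the channel-reweighted feature map of the same shape.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    c = x.shape[0]
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    z = x.mean(axis=(1, 2))
    # Excitation: two fully connected layers with a channel-reduction
    # bottleneck; random weights stand in for learned parameters here.
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    h = np.maximum(w1 @ z, 0.0)            # ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))    # sigmoid gate in (0, 1)
    # Rescale each channel by its attention weight
    return x * s[:, None, None]

feat = np.ones((8, 4, 4))
out = channel_attention(feat)
print(out.shape)  # (8, 4, 4)
```

In a depth network, such a gate lets the model emphasize feature channels that are informative near edges and occlusions while suppressing the rest; LFMCNet's CAPF module applies the same principle across light-field perspectives.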