Natural scenes span an exceedingly wide luminance dynamic range, which commercial light field cameras struggle to capture in a single exposure. Multi-exposure imaging can extend the dynamic range of captured light field images, but the fusion step may introduce various distortions, especially in dynamic scenes. To evaluate the visual quality of multi-exposure fused light field images (MEFLFIs) and to compare multi-exposure light field fusion methods, we first conduct a subjective quality assessment of MEFLFIs for dynamic scenes and establish the first MEFLFI benchmark dataset, which comprises 480 fused light field images generated by eleven state-of-the-art multi-exposure fusion algorithms and two tone-mapping algorithms, together with their subjective scores. We then develop an objective quality assessment metric tailored to MEFLFIs that combines local–global joint features with angular quality-aware features. The key contributions of this work are thus the pioneering MEFLFI benchmark dataset and the tailored objective quality metric. Experiments on the established MEFLFI database show that the proposed metric achieves a Pearson linear correlation coefficient (PLCC) of 0.8895 and a Spearman rank-order correlation coefficient (SROCC) of 0.8658, improvements of approximately 5.7 % and 4.9 %, respectively, over the second-best metric designed specifically for LFI quality assessment. Ablation experiments further confirm that integrating the local–global joint features and the angular features significantly enhances the metric's performance and yields better agreement with human visual perception.
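
For readers unfamiliar with the two reported criteria, the following is a minimal sketch of how PLCC and SROCC between objective predictions and subjective scores can be computed. It is illustrative only and not the authors' evaluation code: the function names and the sample scores are hypothetical, and the sketch omits the nonlinear (logistic) regression step that quality assessment studies often apply before computing PLCC, since the abstract does not state whether one is used.

```python
import numpy as np


def plcc(pred, mos):
    """Pearson linear correlation between predictions and subjective scores."""
    pred = np.asarray(pred, dtype=float)
    mos = np.asarray(mos, dtype=float)
    return float(np.corrcoef(pred, mos)[0, 1])


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def srocc(pred, mos):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(pred), average_ranks(mos))


# Hypothetical scores, for illustration only (not data from the paper).
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
subjective = [1.0, 4.0, 9.0, 16.0, 25.0]

print(f"PLCC  = {plcc(predicted, subjective):.4f}")
print(f"SROCC = {srocc(predicted, subjective):.4f}")
```

In this toy case the predictions order the images exactly as the subjective scores do, so SROCC equals 1, while PLCC is slightly lower because the relationship is monotonic but not linear. This is why the two criteria are usually reported together: SROCC measures prediction monotonicity and PLCC measures linear prediction accuracy.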