The visual security index (VSI) is a quantitative measure of the visual security of perceptually encrypted images. Research on the visual security of encrypted light field (LF) images currently faces two challenges. First, existing perceptually encrypted image databases are often too small, which easily causes convolutional neural networks (CNNs) to overfit. Second, existing VSI models do not take full account of the intrinsic characteristics of LF images and rely heavily on handcrafted feature extraction. In this article, we construct a new database of perceptually encrypted LF images, called PE-SLF, which is 2.6 times as large as the largest existing perceptually encrypted image database. Moreover, we propose a novel VSI model that takes into full consideration the intrinsic spatial-angular characteristics of LF images and the strong feature-extraction capability of CNNs. First, we use a CNN to extract the texture and structure features of encrypted sub-aperture images in the spatial domain. Second, we apply a Gabor filter to extract Gabor features from the epipolar plane images in the angular domain. Finally, spatial and angular similarity measurements are calculated and combined to yield the final visual security score. Experimental results on the constructed PE-SLF database demonstrate that the proposed VSI model agrees more closely with the perception of the human visual system (HVS) in the visual security evaluation of encrypted LF images than other classical and state-of-the-art models.
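The angular-domain step (Gabor features on epipolar plane images, followed by a similarity measurement) can be illustrated with a minimal sketch. It assumes a grayscale light field stored as a NumPy array of shape (V, U, H, W), that is, angular rows, angular columns, image height, and image width. The function names, the Gabor parameters, and the SSIM-style similarity formula are illustrative assumptions and not the settings of the proposed model; the CNN-based spatial branch is omitted.

```python
import numpy as np


def extract_epi(lf, v, y):
    """Horizontal epipolar plane image: fix angular row v and spatial row y.

    Assumes lf has shape (V, U, H, W); the result has shape (U, W).
    """
    return lf[v, :, y, :]


def gabor_kernel(size, sigma, theta, lam, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor kernel (parameter values are assumptions)."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates by the filter orientation theta.
    xr = xs * np.cos(theta) + ys * np.sin(theta)
    yr = -xs * np.sin(theta) + ys * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * xr / lam + psi)
    return envelope * carrier


def filter2d(img, kernel):
    """Same-size 2-D correlation with edge padding (odd-sized kernel)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    return np.einsum("ijkl,kl->ij", windows, kernel)


def gabor_feature(epi, thetas=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Stack of Gabor response magnitudes, one map per orientation."""
    return np.stack(
        [np.abs(filter2d(epi, gabor_kernel(5, 2.0, t, 4.0))) for t in thetas]
    )


def similarity(fa, fb, c=1e-3):
    """SSIM-style pointwise similarity averaged over the feature maps.

    This formula is an assumed stand-in for the angular similarity
    measurement; it equals 1 for identical features and is lower otherwise.
    """
    return float(np.mean((2.0 * fa * fb + c) / (fa ** 2 + fb ** 2 + c)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    plain = rng.random((5, 5, 16, 16))      # synthetic plain light field
    encrypted = rng.random((5, 5, 16, 16))  # synthetic stand-in for an encrypted one
    f_plain = gabor_feature(extract_epi(plain, 2, 8))
    f_enc = gabor_feature(extract_epi(encrypted, 2, 8))
    print("angular similarity:", similarity(f_plain, f_enc))
```

In this sketch a lower similarity between the plain and encrypted features means that less of the epipolar structure survives encryption. A full model would combine such an angular score with a spatial score computed from CNN features of the sub-aperture images.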