Going beyond traditional 2D imaging is not only an emerging trend in imaging technology but also key to a more immersive user experience. The Light Field Image (LFI) is a typical high-dimensional imaging format whose quality evaluation is challenging but necessary. In this paper, we propose a novel Pseudo Video-based Blind quality assessment metric for Light Field images (PVBLiF). In contrast to most previous Light Field Image Quality Assessment (LF-IQA) metrics, which assess quality indirectly through different types of 2D representations derived from the LFI, our metric exploits a more intuitive 3D representation, named the Pseudo Video Block Sequence (PVBS), to evaluate the perceptual quality of the LFI. To this end, we first divide the LFI into a large number of non-overlapping PVBSs, each of which contains both spatial and angular information of the LFI. Then, we propose a novel network (named PVBSNet) based on Convolutional Neural Networks (CNNs) to extract the spatio-angular features of each PVBS and predict its quality. The proposed PVBSNet consists of four stages: multi-information division, intra-feature extraction, cross-feature fusion, and quality regression. Finally, a Saliency- and Variance-guided Pooling (SVPooling) method is presented to integrate the PVBS quality scores into the overall quality of the LFI. The proposed PVBLiF metric has been extensively evaluated on three widely used LFI datasets: Win5-LID, NBU-LF1.0, and SHU. Experimental results demonstrate that PVBLiF outperforms state-of-the-art metrics and closely approximates the performance of human observers. The source code of PVBLiF is publicly available at https://github.com/ZhengyuZhang96/PVBLiF.
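The division and pooling steps described above can be illustrated with a minimal NumPy sketch. It is not the released PVBLiF code, and it rests on several assumptions the abstract does not state: the light field is a grayscale array of shape `(U, V, H, W)` (angular grid by spatial grid), the views are ordered row-major to form the pseudo-video frames, border pixels that do not fill a block are dropped, and the block size is arbitrary. The names `split_into_pvbs` and `weighted_pool` are hypothetical, and `weighted_pool` is a generic weighted average standing in for SVPooling, whose exact weighting rule is not given here.

```python
import numpy as np


def split_into_pvbs(lf, block):
    """Split a light field (U, V, H, W) into non-overlapping block sequences.

    Returns an array of shape (num_blocks, U * V, block, block): each entry is
    one spatial block followed across all views, i.e. a small 3D "pseudo video".
    """
    u, v, h, w = lf.shape
    n_rows, n_cols = h // block, w // block
    if n_rows == 0 or n_cols == 0:
        raise ValueError("block is larger than the spatial resolution")
    # Assumption: drop border pixels that do not fill a whole block.
    lf = lf[:, :, : n_rows * block, : n_cols * block]
    # Assumption: flatten the angular grid row-major into a frame axis.
    frames = lf.reshape(u * v, n_rows, block, n_cols, block)
    # Reorder to (block_row, block_col, frame, y, x), then merge the block grid.
    frames = frames.transpose(1, 3, 0, 2, 4)
    return frames.reshape(n_rows * n_cols, u * v, block, block)


def weighted_pool(scores, weights):
    """Weighted average of per-block scores (a stand-in for SVPooling)."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    if total <= 0:
        # Fall back to a plain mean when all weights vanish.
        return float(scores.mean())
    return float((scores * weights).sum() / total)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lf = rng.random((5, 5, 64, 96))           # synthetic 5x5-view light field
    pvbs = split_into_pvbs(lf, block=32)
    print(pvbs.shape)                         # (6, 25, 32, 32)
    scores = pvbs.mean(axis=(1, 2, 3))        # placeholder for network output
    weights = pvbs.var(axis=(1, 2, 3))        # illustrative variance weights
    print(weighted_pool(scores, weights))
```

In this sketch the per-block scores are placeholders. In the described pipeline they would come from PVBSNet, and the weights would combine saliency and variance cues rather than variance alone.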