Neural decoding, which aims to predict external visual stimuli information from evoked brain activities, plays an important role in understanding human visual system. Many existing methods are based on linear models, and most of them only focus on either the brain activity pattern classification or visual stimuli identification. Accurate reconstruction of the perceived images from the measured human brain activities still remains challenging. In this paper, we propose a novel deep generative multiview model for the accurate visual image reconstruction from the human brain activities measured by functional magnetic resonance imaging (fMRI). Specifically, we model the statistical relationships between the two views (i.e., the visual stimuli and the evoked fMRI) by using two view-specific generators with a shared latent space. On the one hand, we adopt a deep neural network architecture for visual image generation, which mimics the stages of human visual processing. On the other hand, we design a sparse Bayesian linear model for fMRI activity generation, which can effectively capture voxel correlations, suppress data noise, and avoid overfitting. Furthermore, we devise an efficient mean-field variational inference method to train the proposed model. The proposed method can accurately reconstruct visual images via Bayesian inference. In particular, we exploit a posterior regularization technique in the Bayesian inference to regularize the model posterior. The quantitative and qualitative evaluations conducted on multiple fMRI data sets demonstrate the proposed method can reconstruct visual images more accurately than the state of the art.