Industrial data are often high-dimensional, nonlinear, and multimodal. This paper develops a soft sensor model based on the Gaussian mixture variational autoencoder (GMVAE) under the just-in-time learning (JITL) framework. To extract latent representations with multimode characteristics, the GMVAE, a deep neural network model, places a Gaussian mixture model (GMM) on the latent space. After the GMVAE is trained, each latent (or feature) variable can be described by a Gaussian mixture distribution. When a new sample arrives, a mixture symmetric Kullback-Leibler (MSKL) divergence, which measures the similarity between two Gaussian mixture probability density functions, is used to quantify its similarity to the historical data samples. Based on the MSKL divergence, the historical input and output data are weighted, and a local model is then established from the weighted data. The effectiveness of the proposed soft sensor modeling method is validated on a numerical example and on a simulation of the Tennessee Eastman benchmark process.
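
The similarity-weighting and local-modeling steps can be illustrated with a minimal sketch. The abstract does not give the exact MSKL formula, the weighting kernel, or the type of local model, so everything below is an assumption made for illustration: the mixture divergence is taken as a mixture-weighted sum of component-wise symmetric KL divergences between diagonal Gaussians, the sample weights use an exponential kernel with a hypothetical `bandwidth` parameter, and the local model is a weighted linear regression. The function names `sym_kl_diag_gauss`, `mskl`, and `jitl_predict` are hypothetical and do not come from the paper.

```python
import numpy as np

def sym_kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """Symmetric KL, KL(p||q) + KL(q||p), between diagonal Gaussians.

    Inputs broadcast over leading axes; the last axis is the latent dimension.
    The log-variance terms cancel in the symmetric sum.
    """
    diff2 = (mu_p - mu_q) ** 2
    return 0.5 * np.sum(
        var_p / var_q + var_q / var_p - 2.0
        + diff2 * (1.0 / var_p + 1.0 / var_q),
        axis=-1,
    )

def mskl(pi, mu_p, var_p, mu_q, var_q):
    """Assumed mixture form: component-wise symmetric KL weighted by pi.

    pi has shape (K,); mu_* and var_* have shape (K, d).
    """
    return float(np.dot(pi, sym_kl_diag_gauss(mu_p, var_p, mu_q, var_q)))

def jitl_predict(x_query, X_hist, y_hist, divergences, bandwidth=1.0):
    """Weighted linear local model (assumed), fitted for one query sample."""
    # Smaller divergence means higher similarity, hence a larger weight.
    w = np.exp(-np.asarray(divergences, dtype=float) / bandwidth)
    # Append a column of ones so the model has an intercept.
    A = np.hstack([X_hist, np.ones((len(X_hist), 1))])
    # Weighted least squares via row scaling by sqrt(w).
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y_hist * sw, rcond=None)
    return float(np.append(x_query, 1.0) @ beta)
```

In this sketch, `divergences` would hold one MSKL value per historical sample, computed between the latent distribution of the query and that of each historical sample; how those latent distributions are obtained from the trained GMVAE is outside the scope of the example.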