Background
In modern industrial processes, data-driven soft sensor technology plays a key role in intelligent measurement. However, because prediction accuracy depends strongly on the training data, the technology is difficult to apply to industrial processes for which sufficient training samples are hard to obtain.
Methods
To address the problem of insufficient training data for soft sensor models, this paper proposes a virtual sample generation method, PGAN-VSG, based on Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE) and a generative adversarial network. The method combines PHATE-based feature extraction, virtual feature generation (VFG), and a PHATE-based generative adversarial network (PGAN) to generate virtual samples, and proceeds in three phases. First, PHATE embeds the original data into a low-dimensional space to obtain the original feature set, which preserves the local and global structure of the original data. Second, a number of candidate features are randomly generated in the neighborhood of each original feature, and the candidates located in sparse feature regions are selected as virtual features. Third, the PGAN with an L1 distance loss maps the virtual features to the corresponding high-dimensional space to obtain virtual samples. To validate the proposed PGAN-VSG method, a multilayer perceptron (MLP) was employed as the soft sensor model in both a numerical example and an industrial case study of the penicillin fermentation process.
Significant findings
By filling sample-scarce regions while keeping the generated samples consistent with the distribution of the original data, the proposed virtual sample generation (VSG) method improves the prediction accuracy of the soft sensor model and outperforms the classical VSG method.
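
The virtual feature generation (VFG) phase described under Methods can be illustrated with a minimal sketch. The abstract does not specify how the neighborhood or the sparsity criterion is defined, so the choices below are assumptions made only for illustration: candidates are drawn uniformly from a box around each original feature whose half-width is a fraction of that feature's nearest-neighbor distance, and sparsity is scored as the mean distance from a candidate to its k nearest original features. The function name `generate_virtual_features` and all of its parameters are hypothetical, and the low-dimensional features are assumed to have been computed beforehand (for example by a PHATE embedding).

```python
import numpy as np


def generate_virtual_features(features, n_candidates=20, radius_scale=0.5,
                              n_neighbors=3, n_select=None, seed=0):
    """Illustrative VFG sketch (assumed criteria, not the paper's exact rule).

    features: array of shape (n, d) with n >= 2 low-dimensional features.
    Returns an array of shape (n_select, d) of selected virtual features.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    n, d = features.shape

    # Nearest-neighbor distance of every original feature (assumed
    # definition of the size of its "neighborhood").
    pairwise = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(pairwise, np.inf)
    nn_dist = pairwise.min(axis=1)

    # Randomly generate candidates in a box around each original feature.
    half_width = (radius_scale * nn_dist)[:, None, None]
    offsets = rng.uniform(-1.0, 1.0, size=(n, n_candidates, d)) * half_width
    candidates = (features[:, None, :] + offsets).reshape(-1, d)

    # Sparsity score (assumed): mean distance to the k nearest originals.
    cand_dist = np.linalg.norm(candidates[:, None, :] - features[None, :, :], axis=-1)
    k = min(n_neighbors, n)
    sparsity = np.sort(cand_dist, axis=1)[:, :k].mean(axis=1)

    # Keep the candidates lying in the sparsest regions.
    if n_select is None:
        n_select = n
    keep = np.argsort(sparsity)[::-1][:n_select]
    return candidates[keep]


# Usage with synthetic 2-D features standing in for an embedding.
demo_features = np.random.default_rng(1).normal(size=(30, 2))
virtual_features = generate_virtual_features(demo_features, n_select=10)
```

In the proposed method the selected virtual features would then be passed to the PGAN to obtain high-dimensional virtual samples; that step is not sketched here because the abstract gives no architectural details.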