Variational auto-encoders (VAEs) have been widely used in soft sensing because they provide a probabilistic description of the latent space. However, VAEs are static models that do not account for process dynamics, which limits their ability to model complex industrial processes accurately. To address this problem, this paper proposes an adaptive multi-head self-attention based supervised VAE (AMSA-SVAE). Within AMSA-SVAE, an adaptive multi-head self-attention mechanism (AMSA) is built on the multi-head self-attention mechanism (MSA). AMSA dynamically extracts different attention information depending on the task at hand. By adjusting the attention weights according to the input sequence, AMSA allows more accurate and efficient modeling of complex industrial processes. AMSA is then used as the encoder and decoder of the SVAE for soft sensing. Furthermore, building on the data generation capability of the VAE, an adaptive multi-head self-attention based VAE (AMSA-VAE) framework is proposed to address missing data. AMSA-VAE dynamically fills in missing values, thereby extending the applicability of AMSA-SVAE. Finally, the performance of AMSA-SVAE is verified on a set of real industrial data, and the capability of the AMSA-VAE framework is demonstrated by simulating different missing-data rates. By combining the dynamic modeling capability of AMSA-SVAE with the data generation capability of AMSA-VAE, the proposed approach provides a robust solution to the challenges of incomplete data in soft sensing.

Note to Practitioners: Soft sensors are widely used to measure key parameters in industrial processes, but missing values in the data are common because of sensor failures or signal transmission interference. This poses a significant challenge for traditional soft sensors, which require complete data to build an accurate model. The dynamic nature of industrial process data further complicates modeling. To address these challenges, this paper proposes an AMSA-SVAE model for soft sensing and an AMSA-VAE framework for filling in missing values, thereby extending AMSA-SVAE to datasets with missing data. When a dataset contains missing values, the AMSA-VAE framework is first used to fill them in, and the completed data are then fed into AMSA-SVAE for modeling. Finally, the proposed approaches are evaluated through two sets of experiments on a real industrial dataset, which show that AMSA-SVAE and the AMSA-VAE framework perform well in modeling dynamic industrial process data and in addressing the missing-data problem.
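
The fill-then-model workflow described in the Note to Practitioners can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names `mask_at_random`, `fill_missing`, and `run_pipeline` are hypothetical, the column-mean imputer is only a stand-in for the AMSA-VAE generation step, and `soft_sensor` is an arbitrary callable standing in for a trained AMSA-SVAE. The sketch shows the order of operations and how a missing-data rate might be simulated, not the paper's models.

```python
import random


def mask_at_random(rows, missing_rate, seed=0):
    """Return a copy of `rows` in which each entry is replaced by None
    with probability `missing_rate` (simulated missing data)."""
    rng = random.Random(seed)
    return [
        [None if rng.random() < missing_rate else value for value in row]
        for row in rows
    ]


def fill_missing(rows):
    """Placeholder imputer (assumption): replace None with the column mean
    of the observed values. Stands in for the AMSA-VAE filling step."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


def run_pipeline(rows, soft_sensor):
    """Fill missing values first, then pass the completed rows to the
    soft sensor (a stand-in for a trained AMSA-SVAE)."""
    filled = fill_missing(rows)
    return [soft_sensor(row) for row in filled]


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    masked = mask_at_random(data, missing_rate=0.3)
    # A trivial hypothetical soft sensor: the sum of the inputs.
    predictions = run_pipeline(masked, soft_sensor=sum)
    print(masked)
    print(predictions)
```

Keeping the imputer and the soft sensor behind separate functions mirrors the two-stage structure in the text: either stage can be replaced by a learned model without changing the calling code.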