Abstract Soft sensing technology has found extensive application in predicting key quality variables in batch processes. However, its application in batch processes is limited by uneven batch lengths, the correlation of data, and the difficulty of extracting dependencies both between and within variables. To address these issues, we propose a data-stacking multiscale adaptive graph neural network (DSMAGNN) soft sensor model. Firstly, mutual information (MI) is used to select quality-related variables, the 3D batch data is converted by the data stacking strategy into a time-delay sequence suitable for input to the soft sensor model, and the underlying time correlation at different time scales is preserved by incorporating a multiscale pyramid network. Secondly, the dependencies between variables are inferred by the adaptive graph learning module, while the dependencies both within and between variables are modeled by the multiscale temporal graph neural network. Thirdly, collaborative work across different time scales is further facilitated by the scale fusion module. Finally, the feasibility and effectiveness of the model are verified through experiments on the industrial-scale penicillin fermentation process and the hot rolling process.

This paper proposes a data-stacking multiscale adaptive graph neural network (DSMAGNN) soft sensor model. Firstly, MI is used to select quality-related variables, and the data stacking strategy is employed to convert 3D batch data into a time-delay sequence suitable for input to the soft sensor model. A multiscale pyramid network is utilized to preserve the underlying time dependence across different time scales. Secondly, since the dependencies between variables may vary across time scales, the adaptive graph learning modules infer scale-specific variable dependencies without relying on predefined priors.
Thirdly, given the multiscale feature representations and the inferred scale-specific dependencies between variables, we introduce a multiscale temporal graph neural network to jointly model dependencies within and between variables. Building on this foundation, the scale fusion module is deployed to facilitate collaboration across different temporal scales and to automatically capture the importance of the contributing temporal patterns. We validate the feasibility and effectiveness of the proposed model through experiments on industrial-scale penicillin fermentation and hot continuous rolling processes. The results show that the proposed model outperforms several compared models.
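
As an illustration of the data stacking step, the following minimal Python sketch shows one plausible way to unfold batches of uneven length into fixed-length time-delay samples. It is not the authors' implementation: the function name `stack_batches`, the `window` parameter, and the choice to pair each window with the quality value at its last time step are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# One batch: rows are time steps, columns are (already selected) process variables.
Batch = Sequence[Sequence[float]]


def stack_batches(
    batches: Sequence[Batch],
    targets: Sequence[Sequence[float]],
    window: int,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Unfold variable-length batches into fixed-length time-delay samples.

    Hypothetical helper: each sample holds `window` consecutive time steps
    taken from a single batch and is paired with the quality value at the
    last step of that window (an assumption of this sketch).
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    samples: List[List[List[float]]] = []
    labels: List[float] = []
    for batch, quality in zip(batches, targets):
        if len(batch) != len(quality):
            raise ValueError("each batch needs one quality value per time step")
        # Windows never cross a batch boundary, so no padding is required;
        # a batch shorter than `window` simply contributes no samples.
        for end in range(window, len(batch) + 1):
            samples.append([list(row) for row in batch[end - window:end]])
            labels.append(quality[end - 1])
    return samples, labels
```

Under these assumptions, a batch of length T contributes T - window + 1 samples, so batches of different lengths can be stacked along the sample axis into a single training set of shape (samples, window, variables).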