Hydrothermal gasification is an important route for utilizing biomass resources, and accurate prediction of the process is of great significance for formulating and optimizing process parameters and equipment. Data are a key factor in machine learning, but the high experimental cost of hydrothermal gasification makes it difficult to acquire enough experimental data for modeling. To address this issue, this study proposes a data generation and screening strategy based on generative adversarial networks (GANs). The strategy uses a GAN trained on real data as the data generator and a random forest model trained on real data as the data screener. Synthetic samples that satisfy the screening criteria are selected as high-quality data to augment the dataset. To validate the strategy, four machine learning models are used to model the biomass hydrothermal gasification process with the synthetic data. The results show that, compared with modeling on the original data, modeling with synthetic data significantly improves the evaluation metrics for predicting H2, CH4, and CO2, especially in the testing phase. This supports the rationality of the proposed data generation and screening strategy.
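The generate-then-screen idea can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation. A Gaussian jitter of real samples stands in for the trained GAN, a k-nearest-neighbour average stands in for the random forest screener, and the acceptance rule (keep a synthetic sample when its generated target lies within a tolerance `tol` of the screener's prediction) is an assumed criterion, since the screening criteria are not specified here. The toy data and all function names are hypothetical.

```python
import random


def knn_predict(real, x, k=3):
    """Stand-in screener (NOT a random forest): mean target of the k nearest real samples."""
    nearest = sorted(
        real, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x))
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def generate(real, n, noise, rng):
    """Stand-in generator (NOT a GAN): jitter randomly chosen real samples."""
    out = []
    for _ in range(n):
        x, y = rng.choice(real)
        out.append(([v + rng.gauss(0, noise) for v in x], y + rng.gauss(0, noise)))
    return out


def screen(real, synthetic, tol):
    """Assumed criterion: keep samples whose target agrees with the screener within tol."""
    return [(x, y) for x, y in synthetic if abs(knn_predict(real, x) - y) <= tol]


rng = random.Random(0)
# Hypothetical toy data: one input feature, target = 2 * feature.
real = [([i / 10], 2 * i / 10) for i in range(11)]
synthetic = generate(real, 200, noise=0.1, rng=rng)
kept = screen(real, synthetic, tol=0.1)
augmented = real + kept  # real data plus the screened synthetic samples
```

In this sketch both the generator and the screener see only the real data, mirroring the description above; downstream models would then be trained on `augmented`.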