Porous underground structures have recently attracted researchers’ attention for hydrogen gas storage because of their high storage capacity. One of the challenges in storing hydrogen gas in contact with aqueous solutions is estimating its solubility in water. In this study, experimental data were collected from previous research, four outliers were eliminated, and nine machine learning methods were then developed to estimate the solubility of hydrogen in water. A Bayesian optimization algorithm was employed to tune the parameters used in model construction. Based on the error functions and plots, the LSBoost method, with R² = 0.9997 and RMSE = 4.18E-03, was identified as the most accurate. The artificial neural network, CatBoost, Extra Trees, Gaussian process regression, bagged trees, regression trees, support vector machine, and linear regression methods had R² values of 0.9925, 0.9907, 0.9906, 0.9867, 0.9866, 0.9808, 0.9464, and 0.7682 and RMSE values of 2.13E-02, 2.43E-02, 2.44E-02, 2.83E-02, 2.85E-02, 3.40E-02, 5.68E-02, and 1.18E-01, respectively. Residual error plots were then generated and indicated that the LSBoost model performed accurately across all ranges. The maximum residual error was −0.0252, and only 4 data points were estimated with an error greater than ±0.01. A kernel density estimation (KDE) plot of the residual errors showed no specific bias in any model except linear regression. To investigate the impact of temperature, pressure, and salinity on the model outputs, Pearson correlation coefficients were calculated for the LSBoost model. The coefficients for pressure, temperature, and salinity were 0.8188, 0.1008, and −0.5506, respectively, indicating that pressure had the strongest direct relationship with hydrogen solubility, while salinity had an inverse relationship.
Based on these results, the LSBoost method can be applied alongside approaches such as equations of state in real-world underground hydrogen storage scenarios. The findings of this study can improve the understanding of hydrogen solubility in aqueous solutions and aid in the optimization of underground hydrogen storage systems.
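
For readers who want to reproduce the kind of statistics reported above, the following is a minimal Python sketch of RMSE, R², and the Pearson correlation coefficient in their standard textbook forms. It is not the authors' code: the function names and the toy values are hypothetical, and the study may have computed these quantities with a different toolchain or a slightly different convention.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values for illustration only (not data from the study).
pressure = [1.0, 2.0, 3.0, 4.0, 5.0]
measured = [0.10, 0.21, 0.29, 0.41, 0.50]
predicted = [0.11, 0.20, 0.30, 0.40, 0.50]

print("RMSE:", rmse(measured, predicted))
print("R^2:", r_squared(measured, predicted))
print("Pearson(pressure, predicted):", pearson(pressure, predicted))
```

A positive Pearson coefficient indicates that the output tends to rise with the input, which is how the value reported for pressure is read, while a negative coefficient indicates an inverse relationship, as reported for salinity.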