Hydrogen (H2) solubility is a crucial parameter for industrial processes. This study applies several Machine Learning (ML) techniques, including Decision Tree (DT), Multilayer Perceptron (MLP), Random Forest (RF), and Extremely Randomized Trees (ET), to predict H2 solubility across diverse chemical classes: n-alkanes, aliphatic and aromatic hydrocarbons, alcohols, aldehydes, diols, esters, ethers, ketones, and halides. An extensive database of 4337 samples of H2 solubility in 100 different chemicals was compiled from the open literature, and problematic samples were identified during preprocessing. Three input sets were used for model development to determine the best combination of regressors. Including dimensionless pressure and temperature significantly improved modelling accuracy, and polynomial feature extraction improved it further. The models were trained on 76 chemicals and tested on 20 others, with no chemical appearing in both sets. High accuracy on this test set suggests that the models apply to a diverse range of chemicals beyond the study's dataset. The RF and ET ensemble methods outperformed DT and MLP in both accuracy and stability, with ET being the most accurate. These results improve the understanding of H2 solubility in various chemicals, which is crucial for industrial applications such as the design of H2-based renewable energy plants. The study advances ML-based prediction of chemical solubility by combining standardised data preprocessing, polynomial feature extraction, and testing on chemicals unseen during training.
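
As a rough illustration of three steps named in the abstract (dimensionless inputs, polynomial feature extraction, and a train/test split in which no chemical is shared), the following Python sketch uses only the standard library. It is not the study's implementation. The function names, the `"chemical"` key, the degree of 2, and the 20% test fraction are hypothetical. The abstract does not say how pressure and temperature were made dimensionless, so dividing by critical properties (`Tc`, `Pc`) is an assumption made here.

```python
import random
from itertools import combinations_with_replacement
from math import prod


def reduced_inputs(T, P, Tc, Pc):
    """Return dimensionless temperature and pressure.

    Assumption: normalisation by the solvent's critical temperature and
    pressure; the abstract does not specify the definition used.
    """
    return T / Tc, P / Pc


def polynomial_features(x, degree=2):
    """Return all monomials of the entries of x from degree 1 up to `degree`.

    The constant term is omitted. For x = [a, b] and degree 2 the result
    is [a, b, a*a, a*b, b*b].
    """
    feats = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(x, d):
            feats.append(prod(combo))
    return feats


def split_by_chemical(samples, test_fraction=0.2, seed=0):
    """Hold out whole chemicals so that no chemical is in both sets.

    `samples` is a list of dicts, each with a "chemical" key
    (a hypothetical record layout chosen for this sketch).
    """
    chemicals = sorted({s["chemical"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(chemicals)
    n_test = max(1, round(test_fraction * len(chemicals)))
    test_chemicals = set(chemicals[:n_test])
    train = [s for s in samples if s["chemical"] not in test_chemicals]
    test = [s for s in samples if s["chemical"] in test_chemicals]
    return train, test
```

Splitting by chemical rather than by sample is what makes the test score a measure of performance on unseen chemicals: a random per-sample split would place measurements of the same chemical on both sides. The expanded features from `polynomial_features` could then be passed to any regressor, for example a tree ensemble.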