As a result of technological advancements, reliable calculation of hydrogen (H2) solubility in diverse hydrocarbons is now required for the design and efficient operation of processes in chemical and petroleum processing facilities. The accuracy of equations of state (EOSs) in estimating H2 solubility is limited, particularly at high pressures and/or high temperatures, which can lead to energy losses and to safety and environmental problems. In this study, two powerful machine learning techniques for building advanced correlations, the group method of data handling (GMDH) and genetic programming (GP), were used to estimate H2 solubility in hydrocarbons. For that purpose, 1332 experimental data points for H2 solubility in 32 distinct hydrocarbons were collected from 68 systems, covering operating temperatures from 98 K to 701 K and pressures from 0.101325 MPa to 78.45 MPa. The hydrocarbons span six classes: alkanes, alkenes, cycloalkanes, aromatics, polycyclic aromatics, and terpenes. Their molecular masses range from 28.054 to 647.2 g/mol, corresponding to carbon numbers of 2–46. The input features were the solvent molecular weight, critical pressure, and critical temperature, together with the operating pressure and temperature. With a coefficient of determination (R2) of 0.986 and a root mean square error (RMSE) of 0.0132, the GP model reproduced the experimental solubility data more accurately than the GMDH model. According to the sensitivity analysis, operating pressure had the greatest influence on the estimated H2 solubility, followed by the molecular weight of the hydrocarbon solvent and the temperature. The GP model presented in this paper may be used in the chemical and petroleum sectors as a reliable and effective estimator of H2 solubility in diverse hydrocarbons.
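
The R2 and RMSE values quoted above are goodness-of-fit metrics comparing predicted and measured solubilities. The following is a minimal sketch of how such metrics are conventionally computed, assuming the standard definitions (R2 as one minus the ratio of residual to total sum of squares, RMSE as the square root of the mean squared residual); the study itself may define them slightly differently. The function names `rmse` and `r_squared` and the sample numbers are hypothetical and are not taken from the study's data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    squared_errors = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_errors / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical solubility values for illustration only; not data from the study.
measured = [0.010, 0.020, 0.030, 0.040]
predicted = [0.012, 0.018, 0.031, 0.039]

print("RMSE:", rmse(measured, predicted))
print("R2:  ", r_squared(measured, predicted))
```

A lower RMSE and an R2 closer to 1 indicate closer agreement with the measurements, which is the basis on which the GP and GMDH models are compared.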