Gas injection (GI) pressure-volume-temperature (PVT) laboratory data play an important role in evaluating the performance and efficiency of enhanced oil recovery (EOR) processes, including carbon dioxide (CO2) injection. Although a large conventional PVT laboratory data set typically exists for hydrocarbon reservoirs, GI laboratory studies are relatively scarce. Performing EOR laboratory studies may be unnecessary, as in the case of EOR screening, or unfeasible, as when the reservoir fluid composition at current conditions differs from that at initial conditions. Given that GI is to be widely evaluated as a potential EOR process, and given the growing importance of CO2 storage, there is increased demand for time- and cost-effective solutions to predict the outcome of the associated GI laboratory experiments.

While machine learning (ML) is extensively used to predict black-oil (BO) properties, this is not the case for compositional reservoir properties, including those related to GI. The core challenge addressed in this paper is to use the typically extensive conventional hydrocarbon fluid laboratory data to predict the needed GI fluid properties. We present an ML-based solution that predicts swelling test data from readily available fluid properties such as fluid composition and BO properties; that is, the model learns from the samples with GI laboratory data and predicts GI fluid parameters for the much larger data set that lacks them. We also extend the data by building a synthetic data set using calibrated equation-of-state (EoS) models for the large number of fluid samples with swelling data. The augmented data set, which consolidates both laboratory data and calibrated model results, is used to improve the predictive capability of the ML model over a larger GI fraction space than that covered by the laboratory experiments. We present the algorithms and analyze the results obtained from applying them to a uniquely extensive corporate-wide database.
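
The data-augmentation idea outlined above can be illustrated with a minimal sketch. It is not the method of this paper: the function `toy_swelling_factor` is a hypothetical stand-in for a calibrated EoS model (it is not a real equation of state), the single fluid descriptor `c7plus` and all numerical ranges are assumptions chosen for illustration, and a simple nearest-neighbor average replaces whatever ML model is actually used. The sketch only shows why adding model-generated rows at higher injected-gas fractions may help predictions outside the range covered by laboratory swelling tests.

```python
import math
import random


def toy_swelling_factor(c7plus, gas_frac):
    """Hypothetical stand-in for a calibrated EoS model (NOT a real EoS)."""
    return 1.0 + gas_frac * (0.8 - 0.5 * c7plus)


def make_rows(fluids, gas_fracs, rng, noise=0.0):
    """Build ((c7plus, gas_frac), swelling_factor) rows for each fluid."""
    rows = []
    for c7plus in fluids:
        for gas_frac in gas_fracs:
            sf = toy_swelling_factor(c7plus, gas_frac) + rng.gauss(0.0, noise)
            rows.append(((c7plus, gas_frac), sf))
    return rows


def knn_predict(train, x, k=3):
    """Average the targets of the k training rows closest to x."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


rng = random.Random(0)

# Assumed fluid descriptor: C7+ mole fraction of 20 illustrative samples.
fluids = [rng.uniform(0.2, 0.6) for _ in range(20)]

# "Laboratory" rows: noisy, limited to low injected-gas fractions.
lab_rows = make_rows(fluids, [0.1, 0.2, 0.3], rng, noise=0.01)

# Synthetic rows: same fluids, model-generated at higher gas fractions.
synthetic_rows = make_rows(fluids, [0.4, 0.5, 0.6], rng)

augmented = lab_rows + synthetic_rows

# Query a gas fraction outside the laboratory range.
query = (0.4, 0.55)
truth = toy_swelling_factor(*query)
pred_lab = knn_predict(lab_rows, query)
pred_aug = knn_predict(augmented, query)

print(f"lab-only error:  {abs(pred_lab - truth):.3f}")
print(f"augmented error: {abs(pred_aug - truth):.3f}")
```

In this toy setting the augmented set gives the smaller error at the query point simply because it contains rows near the queried gas fraction. In practice the benefit would depend on how well the EoS models are calibrated.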