Risk assessment studies of potential CO2 sequestration projects consider many factors, including the possibility of brine and/or CO2 leakage from the storage reservoir. Detailed multiphase reactive transport simulations have been developed to predict the impact of such leaks on shallow groundwater quality; however, these simulations are computationally expensive and thus difficult to directly embed in a probabilistic risk assessment analysis. Here we present a process for developing computationally fast reduced-order models which emulate key features of the more detailed reactive transport simulations. A large ensemble of simulations that take into account uncertainty in aquifer characteristics and CO2/brine leakage scenarios were performed. Twelve simulation outputs of interest were used to develop response surfaces (RSs) using a MARS (multivariate adaptive regression splines) algorithm (Milborrow, 2015).A key part of this study is to compare different measures of ROM accuracy. We show that for some computed outputs, MARS performs very well in matching the simulation data. The capability of the RS to predict simulation outputs for parameter combinations not used in RS development was tested using cross-validation. Again, for some outputs, these results were quite good. For other outputs, however, the method performs relatively poorly. Performance was best for predicting the volume of depressed-pH-plumes, and was relatively poor for predicting organic and trace metal plume volumes. We believe several factors, including the non-linearity of the problem, complexity of the geochemistry, and granularity in the simulation results, contribute to this varied performance.The reduced order models were developed principally to be used in probabilistic performance analysis where a large range of scenarios are considered and ensemble performance is calculated. We demonstrate that they effectively predict the ensemble behavior. However, the performance of the RSs is much less accurate when used to predict time-varying outputs from a single simulation. If an analysis requires only a small number of scenarios to be investigated, computationally expensive physics-based simulations would likely provide more reliable results. If the aggregate behavior of a large number of realizations is the focus, as will be the case in probabilistic quantitative risk assessment, the methodology presented here is relatively robust.