Abstract

The use of statistical modeling and machine learning techniques to replace computationally expensive simulation models with an equivalent “proxy” or “surrogate” model is becoming commonplace in petroleum reservoir modeling. The traditional approach is to use a classical experimental design such as the Box-Behnken (BB) design, with results fitted to a quadratic response surface. An alternative is to use a Latin Hypercube sampling (LHS) based design in a Monte Carlo simulation framework, with results fitted using an advanced regression technique such as multidimensional kriging. A comparative understanding of the pros and cons of each of these methods does not appear to have been addressed in detail in the literature. This study seeks to evaluate each of these approaches (and variants thereof) for a common problem and discuss strategies for building surrogate models with robust predictive ability. The example problem studied here involves compositional simulation of supercritical CO2 injection into a deep saline formation consisting of a layered reservoir-caprock system with no-flow boundaries. The nine uncertain parameters of interest are: reservoir and caprock thickness/permeability/porosity, permeability anisotropy ratio, CO2 injection rate, and an indicator for permeability layering. A 97-run BB design is chosen for the classical experimental design, and a 97-run Maximin (MM) LHS design is used for the Monte Carlo simulation. In addition, we use a 79-run Augmented Pairs (AP) design as an alternative to the BB design, and a 97-run Maximum Entropy (ME) design as an alternative to the MM design. CO2 injection is simulated for 30 years, with the maximum extent of the CO2 plume, total storage efficiency, and average pressure buildup in the reservoir taken to be the performance metrics of interest. Predictive models for these metrics are built using a quadratic polynomial model with a LASSO variable selection scheme for the BB and AP designs, and using multidimensional kriging for the MM and ME designs. In addition, the predictive ability of each design-model combination is examined using an independent data set and a k-fold cross-validation strategy. The latter involves splitting the data into a training set and a test set, building a regression model on the training set, and validating it with the test set. Repeated application of this procedure yields valuable information regarding the robustness of each regression modeling approach. For the problem of interest, we find that the MM design with a kriging or a quadratic model performs best from a cross-validation standpoint (not just from fitting the entire training data set, which could result in overfitting and biased predictions). We also examine the benefits of first screening the uncertain parameters using a Plackett-Burman type design and then building a 13-run BB design with only the “heavy hitters”. Our results suggest that such a model does not have robust predictive ability compared to the full 97-run BB design, especially under k-fold cross-validation. The main contributions of this paper are to provide insights on the comparative performance of experimental design and Monte Carlo type approaches (and variants thereof) for building surrogate models, and to demonstrate the utility of cross-validation for improving model predictive ability.
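
The following sketch is not the authors' code; it is a minimal, hedged illustration of the workflow the abstract describes: generate a space-filling LHS design for nine parameters, fit a quadratic-polynomial surrogate with LASSO variable selection and a kriging (Gaussian process) surrogate, and compare them with k-fold cross-validation. The dummy simulator, parameter scaling, kernel choice, and SciPy's centered-discrepancy optimization (used here as a stand-in for the Maximin criterion) are all assumptions.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LassoCV
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import KFold, cross_val_score

n_params, n_runs = 9, 97                      # nine uncertain parameters, 97 design runs
sampler = qmc.LatinHypercube(d=n_params, optimization="random-cd", seed=0)
X = sampler.random(n_runs)                    # space-filling LHS design in [0, 1]^9

def run_simulator(x):
    """Placeholder for the compositional CO2-injection simulation (hypothetical)."""
    return np.sin(x @ np.arange(1, n_params + 1)) + 0.1 * x.sum(axis=1)

y = run_simulator(X)                          # one performance metric, e.g. plume extent

# Quadratic response surface with LASSO variable selection
quad_lasso = make_pipeline(PolynomialFeatures(degree=2),
                           StandardScaler(),
                           LassoCV(cv=5, max_iter=50000))

# Kriging-type surrogate: Gaussian process regression with a Matern kernel
kriging = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

# k-fold cross-validation: repeatedly split into training/test folds and score each surrogate
cv = KFold(n_splits=5, shuffle=True, random_state=0)
for name, model in [("quadratic + LASSO", quad_lasso), ("kriging", kriging)]:
    scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    print(f"{name}: mean CV R^2 = {scores.mean():.3f}")
```

In this sketch, the cross-validated R^2 (rather than the fit to the full training set) is the basis for comparing design-model combinations, mirroring the abstract's point that full-data fits can overstate predictive ability.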
