Summary
Predicting oilfield performance is extremely challenging because of the large number of variables that influence and control it. Traditional methods such as decline-curve analysis are commonly used but have been shown to have significant shortcomings. In recent years, advances in machine learning (ML) have provided a new suite of tools for tackling complex multivariate problems such as understanding oil-reservoir performance and predicting the final recovery factor. In this study, we applied a random-forest algorithm to train three predictive models and investigated the influence of the various input variables. To train the algorithm, we built a database that includes information on 32 variables from 93 reservoirs on the Norwegian Continental Shelf. These variables control or potentially influence field performance and include factors that are a function of geology, subsurface conditions, fluids, and the engineering decisions taken in field development. In addition to these controlling parameters, data that record field performance were also compiled, including the estimated recovery factor and production rates. Eighty percent of the data were input into the random-forest algorithm to train the models, whereas 20% were retained to blind test the resulting models. Model accuracy was measured by comparing actual and predicted observations for each prediction metric using the R² score, mean square error, and root mean square error. The production-rate model had a mean square error of 0.004, whereas the mean square error for recovery factor was 0.024. Estimates of average monthly depletion rate had a mean square error of 0.0104. Predictor-importance estimates indicate that geology/depth-dependent variables such as stratigraphic heterogeneity, reservoir depth of burial, average porosity, and diagenetic impact are among the most important variables for predicting recovery factor.
When predicting reservoir oil rate, the most important variables are related to field size, such as cumulative oil produced, number of wells, oil in place (OIP), and bulk rock volume. This study provides data-driven insight into how subsurface and engineering conditions relate to reservoir producibility, as well as a tool for predicting reservoir performance within a basin or region.
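The evaluation protocol described above (an 80/20 blind-test split scored with R², mean square error, and root mean square error) can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the function names, the shuffling seed, the rounding rule for the hold-out size, and the toy actual/predicted values are all assumptions, and the random-forest training step itself is omitted. The only figures taken from the abstract are the reservoir count and the three reported mean square errors.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and hold out test_fraction for blind testing.

    The seed and the rounding rule are assumptions made for this sketch.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mse(actual, predicted):
    """Mean square error: average squared difference between pairs."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean square error."""
    return math.sqrt(mse(actual, predicted))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Split 93 placeholder reservoir indices into training and blind-test sets.
reservoir_ids = list(range(93))
train_ids, test_ids = train_test_split(reservoir_ids)

# Toy values (hypothetical, not from the study) to exercise the metrics.
toy_actual = [0.2, 0.4, 0.6]
toy_predicted = [0.25, 0.35, 0.65]
toy_scores = {
    "mse": mse(toy_actual, toy_predicted),
    "rmse": rmse(toy_actual, toy_predicted),
    "r2": r2_score(toy_actual, toy_predicted),
}

# RMSE implied by the mean square errors reported in the abstract.
reported_mse = {
    "production rate": 0.004,
    "recovery factor": 0.024,
    "monthly depletion rate": 0.0104,
}
implied_rmse = {name: math.sqrt(value) for name, value in reported_mse.items()}
```

Because root mean square error is the square root of mean square error, the reported values correspond to root mean square errors of roughly 0.063 for production rate, 0.155 for recovery factor, and 0.102 for monthly depletion rate, in whatever units or scaling the targets were modeled in, which the abstract does not state.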