Abstract
This paper presents work on predicting oil production with machine learning methods. Multiple linear regression with polynomial features was implemented as the machine learning method. Regression algorithms are suitable and workable for predicting oil production with a data-driven approach. The synthetic dataset was generated with the Buckley-Leverett mathematical model, which is used to calculate hydrodynamics and determine the saturation distribution in oil production problems. Various combinations of the problem parameters were considered: porosity, oil-phase viscosity, and absolute rock permeability were taken as the input parameters for machine learning, and the oil recovery factor was chosen as the output parameter. More than 400 thousand synthetic data points were used to test the multiple regression algorithms. The quality of the regression algorithms was assessed with the mean squared error and the coefficient of determination. Linear regression was found to underfit and did not capture all patterns in the data. Polynomial regression of various degrees was implemented and tested; for our synthetic data, the quadratic polynomial model trains well and predicts the oil recovery factor accurately. To address overfitting, L1 regularization, known as the Lasso regression method, was applied. For the quadratic polynomial regression model, the coefficient of determination on the test data was 0.96, which is a good result. Thus, the data-driven machine learning methods discussed in the paper are expected to be useful for predicting the oil recovery factor from practical oil field data at the production stage.
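The workflow described in the abstract (quadratic polynomial features, L1-regularized regression, evaluation by mean squared error and coefficient of determination) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy target function `toy_recovery_factor`, its coefficients, the regularization strength `alpha`, the sample count, and the scaling of inputs to [-1, 1] are all hypothetical, and the data are not produced by the Buckley-Leverett model.

```python
import random


def quadratic_features(x):
    """Expand inputs into all degree-1 and degree-2 terms (no bias term)."""
    feats = list(x)
    for i in range(len(x)):
        for j in range(i, len(x)):
            feats.append(x[i] * x[j])
    return feats


def soft_threshold(value, threshold):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, alpha, n_sweeps=100):
    """Minimize (1/2n)*sum((y - b - Xw)^2) + alpha*sum(|w|) by coordinate descent."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    intercept = 0.0
    resid = list(y)  # residuals y - intercept - Xw, kept up to date
    z = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        # The intercept is not penalized: move it to the mean residual.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            rho = sum(X[i][j] * (resid[i] + weights[j] * X[i][j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / z[j]
            delta = new_w - weights[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                weights[j] = new_w
    return intercept, weights


def predict(X, intercept, weights):
    return [intercept + sum(w * v for w, v in zip(weights, row)) for row in X]


def mean_squared_error(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def toy_recovery_factor(porosity, viscosity, permeability, rng):
    """Hypothetical quadratic target with noise; NOT the Buckley-Leverett model."""
    return (0.4 + 0.2 * porosity - 0.15 * viscosity + 0.1 * permeability
            + 0.1 * porosity * permeability - 0.08 * viscosity ** 2
            + rng.gauss(0.0, 0.01))


rng = random.Random(0)
# Assumption: the three inputs are already scaled to [-1, 1].
inputs = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(400)]
targets = [toy_recovery_factor(*x, rng) for x in inputs]

features = [quadratic_features(x) for x in inputs]
X_train, X_test = features[:300], features[300:]
y_train, y_test = targets[:300], targets[300:]

intercept, weights = fit_lasso(X_train, y_train, alpha=0.001)
y_pred = predict(X_test, intercept, weights)
test_mse = mean_squared_error(y_test, y_pred)
test_r2 = r2_score(y_test, y_pred)
print(f"test MSE = {test_mse:.6f}, test R^2 = {test_r2:.4f}")
```

The helpers are written out in plain Python only to keep the sketch self-contained. In practice, a library such as scikit-learn (`PolynomialFeatures`, `Lasso`, `mean_squared_error`, `r2_score`) would typically replace them, and the polynomial degree and `alpha` would be chosen by validation on held-out data.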