Abstract

Oil field data are not always accurate and complete. According to Nobakht et al. (2009), corrupted and missing data cost the industry $60 billion annually. Unless extreme caution is taken during data collection, every dataset is expected to contain at least 1-5% error from sources such as human error, measurement setup error, or measurement equipment malfunction. The main issue when dealing with missing and corrupted data is a limited understanding of how the data behave; a better understanding leads to better decisions about how to remedy the missing and corrupted values. Generally, two methods are used to deal with missing data: (1) dropping the missing intervals and (2) estimating the expected values of the missing points. In this paper, both methods are tested and applied to missing production and injection rates in waterflood projects. A simple reservoir model, the Resistivity Model (RM) by Albertoni (2003), was used to calculate the expected values of the missing production rates. Reverse modelling was used to estimate the missing values of the injection rates. Several cases were simulated with two missing ratios, low (15%) and high (30%), in both the production and injection data. Missing points were generated in the datasets in four patterns (Arbitrary, Monotone, Multivariate, and Modified Multivariate). The locations of the missing data were selected in a Monte Carlo-like manner, and results were averaged over 400 realizations for each pattern. An error rate was obtained for each pattern, with data missing from both the injection and production rates. The dropping method showed the largest error propagation rate under the Modified Multivariate pattern. This pattern is expected when there is a systematic accident or problem in the rate measurement devices, so that it takes time before the system returns to normal measurement.
The least error occurs under the Arbitrary pattern, in which data can go missing at any time. The recommendation here is that dropping is not an appropriate method if more than 3% of the data are missing. The value of R² approaches 60% quickly once the amount of missing data reaches 2%, for all patterns. This also helps explain why a method such as RM is normally applied only to complete data sets. For a case with 15% missing data in both the injection and production rates, an R² value of 0.87 was obtained using the imputation method presented. This work found that 6% missing data is the limit for dropping, since the model R² falls off very quickly beyond that amount.
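
The dropping experiment summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the rates are synthetic, an ordinary least-squares fit stands in for the reservoir model (it is not the RM), and the two mask generators (`arbitrary_mask` for scattered gaps, `block_mask` for one sustained outage) are hypothetical stand-ins rather than the pattern definitions used in the paper. Because this stand-in model has no time dependence, it shows only the mechanics of masking, dropping, refitting, and averaging over realizations; it is not expected to reproduce the reported error rates.

```python
import numpy as np

def arbitrary_mask(n_steps, ratio, rng):
    """Mark round(ratio * n_steps) time steps as missing, scattered at random."""
    mask = np.zeros(n_steps, dtype=bool)
    n_missing = int(round(ratio * n_steps))
    mask[rng.choice(n_steps, size=n_missing, replace=False)] = True
    return mask

def block_mask(n_steps, ratio, rng):
    """Mark one contiguous run of time steps as missing (a sustained outage)."""
    mask = np.zeros(n_steps, dtype=bool)
    n_missing = int(round(ratio * n_steps))
    start = rng.integers(0, n_steps - n_missing + 1)
    mask[start:start + n_missing] = True
    return mask

def r_squared(y_true, y_pred):
    """Coefficient of determination of y_pred against y_true."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mean_r2_after_dropping(inj, prod, mask_fn, ratio, n_real=400, seed=0):
    """Drop masked time steps, refit, score on the full record, and average."""
    rng = np.random.default_rng(seed)
    X_full = np.column_stack([np.ones(len(prod)), inj])  # intercept + injectors
    scores = []
    for _ in range(n_real):
        keep = ~mask_fn(len(prod), ratio, rng)
        coef, *_ = np.linalg.lstsq(X_full[keep], prod[keep], rcond=None)
        scores.append(r_squared(prod, X_full @ coef))
    return float(np.mean(scores))

# Synthetic data (assumed): three injectors feeding one producer, plus noise.
data_rng = np.random.default_rng(42)
n_steps = 200
inj = data_rng.uniform(500.0, 1500.0, size=(n_steps, 3))
weights = np.array([0.5, 0.3, 0.2])  # hypothetical connectivity weights
prod = inj @ weights + data_rng.normal(0.0, 25.0, size=n_steps)

for name, mask_fn in [("arbitrary", arbitrary_mask), ("block", block_mask)]:
    for ratio in (0.15, 0.30):
        score = mean_r2_after_dropping(inj, prod, mask_fn, ratio)
        print(f"{name:9s} ratio={ratio:.2f} mean R2={score:.3f}")
```

The 15% and 30% ratios and the 400 realizations mirror the setup described above; swapping the least-squares fit for a time-dependent reservoir model is where the choice of missing pattern would start to matter.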

