Abstract
Distinguished Author Series articles are general, descriptive representations that summarize the state of the art in an area of technology by describing recent developments for readers who are not specialists in the topics discussed. Written by individuals recognized to be experts in the area, these articles provide key references to more definitive work and present specific details only to illustrate the technology. Purpose: to inform the general readership of recent advances in various areas of petroleum engineering.
Introduction
A surprisingly wide range of issues affects the quantification of the relationship between different data types. Understanding these issues is important to our industry (and to most other technical and financial activities) because many of us now create and use these statistical relationships daily. The main line-fitting method, ordinary least-squares regression, has been known for more than a century, but alternative methods continue to be developed. The complexity of oilfield applications results in an ongoing learning curve. The subject is mathematical and sometimes complex and, therefore, causes day-to-day misunderstandings and pitfalls. Better evaluations will result from avoiding the weaknesses and using the strengths of the data.
To allow focusing on the principles, this article describes linear models associating only two data types: A and B. When their relationship is more complex, the relevant nonlinear model, such as a power or exponential, is used. In many cases, both variables are accurately measured and accurately paired with each other. Figs. 1a and 2a show real examples wherein known oilfield properties are plotted vs. known depths. The data scatter in these graphs is not caused by errors in the data but by controlling factors that are not included in the simple two-variable model.
Because of the complex nature of oilfield engineering and rock and fluid properties, significant scatter frequently is seen when oilfield variables are plotted together. In other cases, the measurements A and B could be the best available but may not be measured accurately. Their measurement errors cause further scatter of the plotted data points.
We benefit from creating a mathematical model that allows one data type, A, to provide an estimate of another type, B, or vice versa. The model equation that makes the best prediction of A from B is not the same as the model equation that best predicts B from A. Users of model equations must make the choice of which one is required for the particular application. Prediction models can be created when measured data types are associated, whether or not the data are accurate. When these models are applied, they allow us to make predictions on the basis of the same type of uncertain measured data. The basic technique does not compensate for errors in the data; it creates central values that describe the scatter. A different statistical technique, structural modeling, is needed to create a line fit that compensates for the effect of measurement errors.
Purposes of Modeling
Model associations between two variables may be created to achieve different purposes, so the model's purpose must be carefully considered. A choice is required.
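The point that the best equation for predicting B from A differs from the best equation for predicting A from B can be illustrated with a minimal sketch. It fits ordinary least-squares lines in both directions on a small set of paired values. The data, the helper `ols_fit`, and all variable names are hypothetical and chosen only for illustration; they do not come from the article or its figures.

```python
from statistics import mean


def ols_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical paired measurements (illustrative values only).
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b = [2.1, 2.9, 4.8, 4.2, 6.9, 6.1]

# Model 1: predict B from A (minimizes squared residuals in B).
slope_b_on_a, icpt_b_on_a = ols_fit(a, b)

# Model 2: predict A from B (minimizes squared residuals in A).
slope_a_on_b, icpt_a_on_b = ols_fit(b, a)

# Rearranging Model 2 into the form B = m*A + c shows it is a different line.
slope_m2_as_b = 1.0 / slope_a_on_b
icpt_m2_as_b = -icpt_a_on_b / slope_a_on_b

print(f"B from A:            B = {slope_b_on_a:.3f}*A + {icpt_b_on_a:.3f}")
print(f"A from B:            A = {slope_a_on_b:.3f}*B + {icpt_a_on_b:.3f}")
print(f"A-from-B line as B(A): B = {slope_m2_as_b:.3f}*A + {icpt_m2_as_b:.3f}")
```

With scattered data the two fitted lines coincide only when the correlation is perfect. The product of the two slopes equals the squared correlation coefficient, so the more scatter there is, the further apart the two lines lie, and the more the choice of prediction direction matters.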