Improving oil recovery in the middle and late stages of oilfield development requires a detailed understanding of subsurface reservoir architecture. In dense well pattern areas, however, the large volume of logging data makes efficient architecture identification a major challenge. In this study, we apply the support vector machine (SVM), a supervised machine learning algorithm, together with principal component analysis (PCA) for dimension reduction, to three types of logging data (spontaneous potential (SP), gamma ray (GR), and acoustic (AC)) to automate the identification of braided river architectural elements in the PI2 braided river sand group of the Lamadian Oilfield, in the north of the Daqing Placanticline, Songliao Basin, China. Manual qualitative identification shows that the architectural elements of the second sand group of the first member of the Putaohua braided river reservoir (PI2) in the Lamadian area consist of mid-channel bars, braided channels, and floodplains. The presence of three types of sediments (medium- to fine-grained sand, fine-grained silty sand, and mudstone) indicates that the water body energy varied within the PI2 braided river. The SVM algorithm development process includes determining the input and output variables, selecting the kernel function, optimizing the parameters, training the algorithm with known samples, and testing the algorithm. With twelve feature parameters (the median, relative barycenter, variance, and root of variational variance of each of the SP, GR, and AC data) as input variables and the braided river architecture categories as output variables, we use a Gaussian radial basis function (RBF) kernel and grid search to determine the optimal values of the penalty factor C and the kernel parameter gamma (γ). Without dimension reduction, the identification accuracy of the algorithm is 85.15%. To eliminate data redundancy, we use PCA to reduce the dimensionality of the input data. 
With dimension reduction, the identification accuracy of the algorithm increases by 6.93 percentage points to 92.08%. Because the logging response features are not clearly distinct in the transition zone between mid-channel bars and braided channels, these two architectural elements are commonly misclassified. Even with this persistent misclassification, the error of our SVM identification process is only 7.92%, which suggests that the method is robust enough to provide accurate results when applied to specific repetitive and complex geological big data problems.
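
The workflow summarized above (twelve input features, an RBF-kernel SVM, grid search over C and γ, with and without PCA) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the implementation used in the study. The synthetic data from `make_classification`, the feature standardization step, the grid values, the 5-fold cross-validation, and the 95% retained-variance threshold for PCA are all assumptions introduced here for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical grid; the abstract does not report the values that were searched.
PARAM_GRID = {"svm__C": [0.1, 1, 10, 100], "svm__gamma": [0.001, 0.01, 0.1, 1]}


def build_search(use_pca):
    """Grid search over C and gamma for an RBF SVM, optionally preceded by PCA."""
    steps = [("scale", StandardScaler())]  # assumed preprocessing step
    if use_pca:
        # Keep enough components to explain 95% of the variance (assumed threshold).
        steps.append(("pca", PCA(n_components=0.95)))
    steps.append(("svm", SVC(kernel="rbf")))
    return GridSearchCV(Pipeline(steps), PARAM_GRID, cv=5)


# Synthetic stand-in for the 12 log-derived features and the 3 architecture classes
# (mid-channel bar, braided channel, floodplain). Not real well-log data.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, n_redundant=4,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

results = {}
for use_pca in (False, True):
    search = build_search(use_pca).fit(X_train, y_train)
    # score() reports the accuracy of the refitted best model on held-out samples.
    results[use_pca] = (search.best_params_, search.score(X_test, y_test))

for use_pca, (params, accuracy) in results.items():
    print(f"PCA={use_pca}: best params {params}, test accuracy {accuracy:.4f}")
```

Placing PCA inside the pipeline means the projection is refitted on each cross-validation training fold, so the held-out fold does not influence the choice of C and γ. The accuracies printed for the synthetic data have no bearing on the 85.15% and 92.08% reported for the PI2 logging data.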