In this study, three machine learning models (support vector machine, naïve Bayes, and random forest) are combined with a geographic information system to delineate the tar mat occurrence hazard in the upper part of the Zubair Formation (the DJ unit) in the Rumaila oil field, southern Iraq. The models are built from a well inventory map of 213 wells (tar and tar-free) and five factors that control the tar mat distribution in the DJ reservoir unit: porosity, volume of shale, water saturation, DJ unit thickness, and distance to the major fold axis. Multicollinearity and feature selection tests demonstrate that three of the five factors (distance to the major fold axis, water saturation, and porosity) are responsible for the tar distribution in the reservoir unit. These three factors and the well inventory are used to develop the machine learning models, and the results are compared using five evaluation measures: accuracy, sensitivity, specificity, kappa, and the relative operating characteristic curve. The results suggest that random forest is the best model, followed by the support vector machine with a polynomial kernel. The random forest prediction model is used to delineate the tar mat distribution, and the probability of tar mat occurrence is classified into five hazard zones: very low, low, moderate, high, and very high. The hazard map produced in this study can be used as a guide for drilling new wells in the considered reservoir unit and for avoiding locations where tar is expected to occur.
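As a rough illustration of the evaluation and zoning steps described above, the sketch below computes accuracy, sensitivity, specificity, and Cohen's kappa from binary well labels, and assigns a predicted probability to one of the five hazard zones. It is a minimal sketch under stated assumptions, not the study's implementation: the function names are hypothetical, the labels are coded as 1 for tar and 0 for tar-free, and the equal-interval class breaks (0.2 wide) are assumed because the abstract does not state how the zones were delimited. The relative operating characteristic curve is omitted for brevity.

```python
# Illustrative sketch only. Function names, label coding (1 = tar, 0 = tar-free),
# and the equal-interval zone breaks are assumptions, not taken from the study.

ZONES = ["very low", "low", "moderate", "high", "very high"]


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluation_measures(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, and Cohen's kappa.

    Assumes both classes occur in y_true and that chance agreement is
    below 1; otherwise a division by zero is raised.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # share of tar wells predicted as tar
    specificity = tn / (tn + fp)  # share of tar-free wells predicted as tar-free
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


def hazard_zone(probability):
    """Map a tar mat probability in [0, 1] to one of five zones.

    Uses assumed equal-interval breaks; a probability of 1.0 falls in the
    top zone.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    index = min(int(probability * 5), len(ZONES) - 1)
    return ZONES[index]


if __name__ == "__main__":
    # Made-up labels for eight hypothetical wells, for demonstration only.
    observed = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 0, 0, 1, 1, 0]
    print(evaluation_measures(observed, predicted))
    print(hazard_zone(0.95))
```

In practice the class breaks could instead come from quantiles or natural breaks of the predicted probabilities; the equal-interval choice here is only the simplest option to show.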