Abstract

Lithology identification is an indispensable part of geological research and petroleum engineering. In recent years, several mathematical approaches have been used to improve the accuracy of lithology classification. Building on our earlier work, which assessed machine learning models for formation lithology classification, we optimize boosting approaches to improve their classification ability, using data collected from the Daniudi and Hangjinqi gas fields. Three boosting models, namely AdaBoost, Gradient Tree Boosting, and eXtreme Gradient Boosting, are evaluated with 5-fold cross validation. Regularization is applied to Gradient Tree Boosting and eXtreme Gradient Boosting to avoid overfitting. After tuning the hyperparameters of each boosting model to optimize its parameter set, we use stacking to combine the three optimized models and improve the classification accuracy. Results suggest that the optimized stacked boosting model outperforms each single optimized boosting model on evaluation metrics such as precision, recall, and f1 score. The confusion matrix also shows that the stacked model is better at distinguishing sandstone classes.
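
The workflow described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic stand-ins for well log features, the hyperparameter values are arbitrary assumptions, the logistic regression meta-learner is an assumed choice, and eXtreme Gradient Boosting is left out to keep the sketch within one library (an `xgboost.XGBClassifier` could be appended to `base_models` if that package is installed).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for well log features and lithology labels (assumption).
X, y = make_classification(
    n_samples=400, n_features=7, n_informative=5, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Base boosting models. All hyperparameter values here are illustrative.
base_models = [
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("gtb", GradientBoostingClassifier(
        n_estimators=50,
        learning_rate=0.1,
        subsample=0.8,        # fraction of samples used to fit each base learner
        max_features="sqrt",  # number of features considered at each split
        random_state=0,
    )),
]

# Stacking: out-of-fold predictions of the base models feed a meta-learner.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
    cv=5,
)

# 5-fold cross validation of the stacked model.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(stack, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"5-fold accuracy: {mean_accuracy:.3f}")
```

Setting `subsample` below 1.0 and restricting `max_features` are the two regularization controls that scikit-learn exposes for Gradient Tree Boosting; in practice their values would come from a hyperparameter search rather than being fixed by hand.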

Highlights

  • Well log data contain rich geological information, which is a synthesized reflection of formation lithology and physical properties. Therefore, geological interpretation of well log data and interpretation accuracy are crucial

  • Beyond the basic parameter set like the learning rate and the number of boosting stages, we investigate the impact of regularization parameters, namely, the fraction of samples to be applied for fitting the individual base learners and the number of features to be considered when determining the best split for individual base learners, on the performance of the model

  • Well logging data acquired from several wells from two gas fields in the Ordos Basin were used to train the boosting models for the multiclass lithology classification problem. The performance of three individual boosting models and their stacked model was evaluated through classification accuracy, precision, recall, f1 score, and confusion matrix
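
To make the evaluation measures named in the highlights concrete, the sketch below computes a confusion matrix, accuracy, and per-class precision, recall, and f1 score in plain Python. The class names and the eight true/predicted labels are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_scores(matrix, labels):
    """Precision, recall, and f1 score for each class."""
    scores = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Hypothetical lithology labels: coarse sandstone, fine sandstone, mudstone.
labels = ["CS", "FS", "M"]
y_true = ["CS", "CS", "FS", "FS", "FS", "M", "M", "M"]
y_pred = ["CS", "FS", "FS", "FS", "CS", "M", "M", "M"]

matrix = confusion_matrix(y_true, y_pred, labels)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)
scores = per_class_scores(matrix, labels)

for label, row in zip(labels, matrix):
    print(label, row, scores[label])
print("accuracy:", accuracy)
```

In this toy case the off-diagonal counts sit between the two sandstone classes, which is the kind of confusion a confusion matrix exposes and that a single accuracy figure would hide.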

Summary

Introduction

Well log data contain rich geological information, which is a synthesized reflection of formation lithology and physical properties. Therefore, geological interpretation of well log data and interpretation accuracy are crucial. Conventional interpretation methods, such as statistical approaches, have low accuracy and low efficiency. A reliable way to understand subterranean lithology is to obtain core samples or cuttings from the reservoirs. However, this is expensive, and there is always some depth uncertainty [1]. With the development of logging tools such as wireline logging and logging while drilling, large amounts of data can now be collected in the petroleum industry [2]. Many researchers have applied different algorithms to lithology identification and achieved strong performance by updating and improving those algorithms.

