Stratigraphic identification from wire-line logs and core samples is a common method for lithology classification. This traditional approach is considered superior despite its significant financial cost. Artificial neural networks and machine learning offer cost-effective alternatives for automated data interpretation, allowing geoscientists to extract insights from data. Supervised and semi-supervised learning techniques are commonly employed, but they require a sufficient amount of labeled data generated through manual interpretation, and labeled geophysical data are typically scarce while unlabeled data are abundant. These techniques therefore only partially address the cost issue. An underutilized class of machine-learning methods, unsupervised data clustering, can perform a comparable classification by grouping similar data without requiring known results, presenting an even more cost-effective solution. In this study, we examine a state-of-the-art unsupervised data clustering algorithm called piecemeal clustering to identify lithofacies from wire-line logs. The piecemeal clustering algorithm groups similar wire-line log signatures into clusters, determines the number of clusters present in the data, and assigns each signature to one of the clusters, each of which represents a lithofacies. To evaluate its performance, we tested the algorithm on publicly released data from ten wells drilled in the Hugoton and Panoma fields of southwest Kansas and northwest Oklahoma, respectively. The data consist of two major groups: marine and non-marine facies. The study addresses two fundamental research questions concerning the accuracy and practicality of the piecemeal clustering algorithm. The algorithm identified nine distinct clusters in our dataset, matching the cluster count reported in previously published works that used the same data. The mapping accuracy was 81.90% when adjacent facies were considered and 45.20% when they were not. The results were analyzed in further detail for individual facies types and independently for each well. These findings suggest that the algorithm characterizes the geological formations with good precision. To assess its relative performance, we conducted a comprehensive comparative analysis against other data clustering algorithms as well as supervised and semi-supervised machine learning techniques. The piecemeal clustering algorithm outperformed the alternative data clustering methods. Furthermore, despite its unsupervised nature, the algorithm yielded results comparable to, or even surpassing, those obtained through supervised and semi-supervised techniques.
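The two accuracy figures reported above differ only in whether adjacent facies are considered. A minimal sketch of one way such a pair of scores could be computed is given below. It assumes that a prediction of a facies adjacent to the true one is treated as a hit, and that each cluster has already been mapped to a facies label. The function name `facies_accuracy`, the adjacency map, and the label sequences are hypothetical toy values chosen for illustration; they are not taken from the study or its dataset.

```python
from typing import Dict, Optional, Sequence, Set


def facies_accuracy(
    true_labels: Sequence[int],
    predicted_labels: Sequence[int],
    adjacent: Optional[Dict[int, Set[int]]] = None,
) -> float:
    """Fraction of samples whose predicted facies counts as correct.

    Without `adjacent`, only exact matches count. With `adjacent`, a
    prediction also counts when it is listed as a neighbour of the true
    facies (an assumed reading of "considering adjacent facies").
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("label sequences must not be empty")

    hits = 0
    for true, pred in zip(true_labels, predicted_labels):
        exact_match = true == pred
        neighbour_match = adjacent is not None and pred in adjacent.get(true, set())
        if exact_match or neighbour_match:
            hits += 1
    return hits / len(true_labels)


# Hypothetical toy example: three facies in which 1-2 and 2-3 are neighbours.
ADJACENT = {1: {2}, 2: {1, 3}, 3: {2}}
true_facies = [1, 2, 3, 3, 1]
pred_facies = [1, 3, 3, 1, 2]

exact = facies_accuracy(true_facies, pred_facies)
relaxed = facies_accuracy(true_facies, pred_facies, ADJACENT)
print(f"exact: {exact:.2f}, with adjacent facies: {relaxed:.2f}")
```

Because the relaxed score accepts every exact match plus neighbouring predictions, it can never be lower than the exact score, which is consistent with the ordering of the two figures reported above.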