Abstract

Lithofacies is a key parameter in reservoir characterization. With advances in machine learning, many researchers have attempted to predict lithofacies from well-log curves using machine-learning algorithms. However, existing models are built purely on data and therefore offer little interpretability. In addition, the lithofacies distribution is highly imbalanced. We incorporate domain knowledge into a gated recurrent unit network so that the model learns from both the data and the knowledge. The domain knowledge is expressed as first-order logic rules and is incorporated into the machine-learning pipeline through additional loss terms. Specifically, the rules are: (1) if the density is smaller than or equal to [Formula: see text], then the lithofacies is coal; (2) if the density is larger than or equal to [Formula: see text] or the neutron porosity is smaller than or equal to [Formula: see text], then the lithofacies is anhydrite; and (3) if the gamma-ray value is larger than or equal to [Formula: see text], then the lithofacies is shale. Here, [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text] are parameters learned by the model. By applying this domain knowledge, we aim to explain why the model predicts a lithofacies as coal, anhydrite, or shale and to reduce the effect of imbalanced data on the model’s performance. We evaluate the method on a data set from the North Sea; the machine-learning pipeline with embedded domain knowledge performs slightly better than the baseline model that does not consider domain knowledge. One drawback of the method is that the domain knowledge we provide is incomplete: it covers only coal, anhydrite, and shale. In future work, we will attempt to develop rules for other types of lithofacies.
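
The idea of turning the three rules into additional loss terms can be illustrated with a minimal sketch. It is not the authors' implementation: the sigmoid relaxation of the comparisons, the probabilistic-sum relaxation of "or", the penalty form premise × (1 − predicted probability), the `sharpness` and `weight` parameters, and all threshold values are assumptions chosen for illustration only (the abstract states that the thresholds are learned by the model and does not give their values).

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_leq(value, threshold, sharpness=10.0):
    """Soft truth value in (0, 1) of `value <= threshold` (assumed relaxation)."""
    return sigmoid(sharpness * (threshold - value))


def soft_geq(value, threshold, sharpness=10.0):
    """Soft truth value in (0, 1) of `value >= threshold` (assumed relaxation)."""
    return sigmoid(sharpness * (value - threshold))


def implication_penalty(premise, conclusion_prob):
    """Penalty for violating `premise -> conclusion`.

    Large when the premise holds but the predicted probability is low.
    """
    return premise * (1.0 - conclusion_prob)


def rule_penalty(logs, probs, thresholds):
    """Sum of the penalties for the three rules at one depth sample."""
    # Rule 1: density <= t1  ->  coal
    coal_premise = soft_leq(logs["density"], thresholds["coal_density_max"])
    # Rule 2: density >= t2 OR neutron porosity <= t3  ->  anhydrite
    dense = soft_geq(logs["density"], thresholds["anhydrite_density_min"])
    tight = soft_leq(logs["neutron"], thresholds["anhydrite_neutron_max"])
    anhydrite_premise = dense + tight - dense * tight  # soft "or"
    # Rule 3: gamma ray >= t4  ->  shale
    shale_premise = soft_geq(logs["gamma"], thresholds["shale_gamma_min"])
    return (
        implication_penalty(coal_premise, probs["coal"])
        + implication_penalty(anhydrite_premise, probs["anhydrite"])
        + implication_penalty(shale_premise, probs["shale"])
    )


def total_loss(data_loss, logs, probs, thresholds, weight=1.0):
    """Data-driven loss plus the weighted rule penalty (assumed combination)."""
    return data_loss + weight * rule_penalty(logs, probs, thresholds)


# Hypothetical threshold values, for illustration only.
thresholds = {
    "coal_density_max": 1.8,
    "anhydrite_density_min": 2.9,
    "anhydrite_neutron_max": 0.02,
    "shale_gamma_min": 120.0,
}
# A low-density sample: rule 1 says it should be coal.
sample = {"density": 1.3, "neutron": 0.4, "gamma": 40.0}
agrees = {"coal": 0.9, "anhydrite": 0.05, "shale": 0.05}
disagrees = {"coal": 0.1, "anhydrite": 0.05, "shale": 0.05}
print(rule_penalty(sample, agrees, thresholds))
print(rule_penalty(sample, disagrees, thresholds))
```

In this sketch a prediction that contradicts rule 1 receives a larger penalty than one that agrees with it. Because the soft comparisons are smooth in the thresholds, an automatic-differentiation framework could in principle update them by gradient descent alongside the network weights, which is one plausible reading of "parameters learned by the model".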
