Abstract

Geophysical data is a form of spatial data for which conventional machine learning algorithms and evaluation techniques have several limitations. A key limitation is that models trained on geophysical data often generalize poorly when deployed on new, unseen data. We address the problem of inaccurate performance assessment of machine learning models, which stems from violating independence assumptions during the feature selection and evaluation phases of the learning process. Our proposed spatially-aware and model-agnostic (SAMA) framework provides a suite of spatially-aware feature generation, feature selection, and model validation algorithms that account for the spatial characteristics of geophysical data. The framework is model-agnostic because it tackles data-related challenges that do not depend on the specific machine learning algorithm used to fit the data. To demonstrate the effectiveness of the proposed approach, we apply it to the water saturation mapping problem, training a prediction model on a novel geophysical dataset. The proposed spatially-aware model obtains an $R^{2}$ of 0.620 and an $RMSE$ of 0.220 for predicting water saturation in the Whole Region of the reservoir model box, and an $R^{2}$ of 0.161 and an $RMSE$ of 0.263 in the Interwell Region. Extensive experiments on 5 additional unseen datasets show that the model maintains stable performance across datasets, which demonstrates the ability of the SAMA framework to produce robust models that transfer to new datasets.
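
To illustrate the independence issue raised above, the following is a minimal sketch of spatially blocked cross-validation, in which whole spatial blocks are held out together so that nearby, correlated samples never sit on both sides of a train/test split. It is not the SAMA framework's own validation algorithm, which the abstract does not specify. The helper names `spatial_block_ids` and `blocked_cv_scores`, the square-block grouping, the `Ridge` model, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GroupKFold


def spatial_block_ids(x, y, block_size):
    """Assign each sample to a square spatial block (hypothetical helper)."""
    bx = np.floor(np.asarray(x, dtype=float) / block_size).astype(int)
    by = np.floor(np.asarray(y, dtype=float) / block_size).astype(int)
    # Shift indices to start at zero, then flatten the 2-D block index
    # into a single integer label per block.
    n_by = by.max() - by.min() + 1
    return (bx - bx.min()) * n_by + (by - by.min())


def blocked_cv_scores(model, features, target, groups, n_splits=5):
    """Return per-fold (R^2, RMSE), holding out whole blocks together."""
    scores = []
    splitter = GroupKFold(n_splits=n_splits)
    for train_idx, test_idx in splitter.split(features, target, groups):
        model.fit(features[train_idx], target[train_idx])
        pred = model.predict(features[test_idx])
        r2 = float(r2_score(target[test_idx], pred))
        rmse = float(np.sqrt(mean_squared_error(target[test_idx], pred)))
        scores.append((r2, rmse))
    return scores


# Synthetic stand-in data (assumption): a smooth spatial field plus noise.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(400, 2))
target = np.sin(coords[:, 0] / 15) + 0.1 * rng.normal(size=400)
groups = spatial_block_ids(coords[:, 0], coords[:, 1], block_size=20)

for fold, (r2, rmse) in enumerate(blocked_cv_scores(Ridge(), coords, target, groups)):
    print(f"fold {fold}: R^2={r2:.3f}, RMSE={rmse:.3f}")
```

Comparing these scores with those from an ordinary shuffled K-fold on the same data gives a rough sense of how much a random split can overstate performance when samples are spatially correlated.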
