This study uses a machine learning (ML) ensemble modeling approach to predict porosity from multiple seismic attributes in one of the most promising Main Dolomite hydrocarbon reservoirs in NW Poland. The presented workflow tests five model types of varying complexity: K-nearest neighbors (KNN), random forests (RF), extreme gradient boosting (XGB), support vector machine (SVM), and a single-layer neural network (multilayer perceptron, MLP). The selected models are additionally run with different pre-processing configurations, including the Yeo–Johnson transformation (YJ) and principal component analysis (PCA). The race ANOVA method across resampled data is used to tune the hyperparameters of each model. The model candidates and the role of the different pre-processors are evaluated with standard ML metrics: coefficient of determination (R²), root mean squared error (RMSE), and mean absolute error (MAE). Model stacking is performed on five candidates: two KNN, two XGB, and one SVM with PCA, the last playing only a marginal role. The ensemble model outperformed the single learners on all metrics (R² 0.890, RMSE 0.0252, MAE 0.168). It also performed almost three times better than the neural network (NN) results obtained from commercial software on the same testing set (R² 0.318, RMSE 0.0628, MAE 0.0487). The spatial distribution of porosity from the ensemble model indicated areas of good reservoir properties that overlap with hydrocarbon production fields. This observation complements the metric-based evaluation of the ensemble results. Overall, the proposed solution is a promising tool for better porosity prediction and understanding of heterogeneous carbonate reservoirs from multiple seismic attributes.
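The stacking workflow summarized above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not name its software, so scikit-learn is an assumption here, the data are synthetic stand-ins for seismic attributes and porosity, `GradientBoostingRegressor` is swapped in for XGB to keep the example dependency-free, the pairing of pre-processors with the KNN and boosting candidates is hypothetical, and all hyperparameters are fixed by hand rather than tuned by race ANOVA.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer
from sklearn.svm import SVR

# Synthetic stand-in for seismic attributes (X) and porosity (y); assumption.
X, y = make_regression(n_samples=400, n_features=8, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)


def yj():
    """Yeo-Johnson transformation followed by standardization."""
    return PowerTransformer(method="yeo-johnson", standardize=True)


# Five candidates mirroring the abstract's mix (two KNN, two boosted-tree,
# one SVM with PCA). Pre-processor pairings and settings are hypothetical.
candidates = [
    ("knn_yj", make_pipeline(yj(), KNeighborsRegressor(n_neighbors=5))),
    ("knn_yj_pca", make_pipeline(yj(), PCA(n_components=5),
                                 KNeighborsRegressor(n_neighbors=5))),
    ("gb", GradientBoostingRegressor(random_state=0)),
    ("gb_yj", make_pipeline(yj(), GradientBoostingRegressor(random_state=0))),
    ("svm_yj_pca", make_pipeline(yj(), PCA(n_components=5), SVR(C=10.0))),
]

# The meta-learner is fit on out-of-fold predictions of the candidates.
stack = StackingRegressor(estimators=candidates,
                          final_estimator=RidgeCV(), cv=5)
stack.fit(X_train, y_train)
pred = stack.predict(X_test)

# The three metrics named in the abstract, computed on the held-out set.
metrics = {
    "R2": r2_score(y_test, pred),
    "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
    "MAE": float(mean_absolute_error(y_test, pred)),
}
print(metrics)
print(dict(zip([name for name, _ in candidates],
               stack.final_estimator_.coef_)))
```

The printed meta-learner coefficients give a rough view of how much each candidate contributes to the blend, which is one way a "marginal role" for a member could show up in such a setup.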