Accurate prediction of porosity and permeability is crucial for understanding subsurface fluids. Conventional physical measurement methods, however, are both costly and time-consuming, and existing machine learning predictors require a large number of samples, so their performance is limited when data are scarce. To address the acute scarcity of core data and the weak logging responses found in actual working areas, we propose RTF-vEXR, a machine learning model for porosity and permeability prediction from well logging data that combines REaLTabFormer (RTF) with a voting ensemble of Extra Trees, XGBoost, and Random Forest (vEXR). RTF-vEXR consists of two primary components. First, the REaLTabFormer data generation model captures the intrinsic correlations between the logging parameters and the target parameters (porosity and permeability), thereby enhancing the quality of the core data. Second, a vEXR-based ensemble regression model, which offers robustness and strong fitting ability, predicts porosity and permeability in tight sandstone reservoirs. We apply the proposed RTF-vEXR model to porosity and permeability prediction in the Sulige Gas Field. Extensive experimental results indicate that, compared with the baseline methods, RTF-vEXR provides the best fit to core porosity and permeability, achieving the highest R-squared value and the lowest mean absolute error and root mean square error, which demonstrates the feasibility and effectiveness of the method. Furthermore, ablation experiments indicate that integrating the RTF module significantly improves the model's performance in low-data scenarios.
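
As a rough illustration of the voting step and of the three evaluation metrics named above (R-squared, mean absolute error, root mean square error), the following is a minimal sketch in plain Python. It assumes unweighted averaging of the base regressors' predictions, which the abstract does not specify; the function names and the porosity values are hypothetical and are not taken from the study.

```python
from math import sqrt

def vote_average(predictions):
    """Unweighted voting: average the per-sample predictions of several regressors.

    Assumption: equal weights for all base models.
    """
    n_models = len(predictions)
    return [sum(sample) / n_models for sample in zip(*predictions)]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up core porosity values (%) and made-up predictions from three
# base regressors; these only stand in for Extra Trees, XGBoost, and
# Random Forest outputs and are not real model results.
core_porosity = [8.0, 10.0, 12.0, 6.0]
model_preds = [
    [7.0, 10.0, 13.0, 6.0],  # stand-in for Extra Trees
    [8.0, 11.0, 12.0, 5.0],  # stand-in for XGBoost
    [9.0, 9.0, 14.0, 7.0],   # stand-in for Random Forest
]

ensemble = vote_average(model_preds)
print("ensemble:", ensemble)
print("MAE:", mae(core_porosity, ensemble))
print("RMSE:", rmse(core_porosity, ensemble))
print("R2:", r_squared(core_porosity, ensemble))
```

In this toy case the averaged prediction lies closer to the core values than any single stand-in model on most samples, which is the intuition behind using a voting ensemble; whether that holds on real logging data depends on the base models and the data.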