Seismic fluid prediction within a machine-learning framework is of great significance for oil and gas exploration and development, geothermal energy exploitation, carbon dioxide sequestration monitoring, and groundwater management. Data-driven supervised machine-learning algorithms often rely heavily on the characteristics of the data (the number of labels and the data distribution). A disparity between the numbers of majority-class and minority-class samples can hinder the generalization ability of a machine-learning model, in particular weakening its predictive power for minority classes (e.g., hydrocarbon-bearing rocks), which are often the classes of primary interest. For a clastic reservoir exhibiting a typical class imbalance (the ratio of gas sandstone to other lithofluid classes is very low), we investigate and compare various class-rebalance methods, using a supervised convolutional neural network, to enhance the model’s ability to predict gas-bearing sandstones. To rebalance the classes, we mainly use sampling methods to obtain class-balanced data sets and cost-sensitive learning methods to modify the loss function. Crosswell blind tests indicate that the ensemble-based undersampling method BalanceCascade is the most effective in enhancing prediction performance, increasing the F1 score of gas sandstone by as much as 15%. We also propose combining BalanceCascade with the focal-loss (FL) method, which further improves the F1 score of gas-bearing sandstone in several wells compared with using BalanceCascade or FL alone. By incorporating class-rebalance strategies into model building, we obtain more reliable seismic prediction results for gas-bearing sandstone.
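To illustrate the cost-sensitive side of class rebalancing mentioned above, the following is a minimal sketch of a multi-class focal loss in plain Python. It uses the commonly cited form FL = -alpha_t (1 - p_t)^gamma log(p_t); the function name, the toy probabilities, and the choice gamma = 2 are assumptions made for illustration and are not taken from the study's implementation.

```python
import math


def focal_loss(probs, labels, gamma=2.0, alpha=None):
    """Mean multi-class focal loss (illustrative sketch).

    probs  : list of per-sample class-probability lists
    labels : list of integer class indices (true classes)
    gamma  : focusing parameter; gamma = 0 reduces to cross-entropy
    alpha  : optional list of per-class weights (hypothetical values)
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p_t = min(max(p[y], eps), 1.0)  # probability of the true class
        weight = 1.0 if alpha is None else alpha[y]
        # Well-classified samples (p_t near 1) are down-weighted.
        total += -weight * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(labels)


# Toy data: class 1 stands in for a rare class such as gas sandstone.
probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]]
labels = [0, 0, 1]

ce = focal_loss(probs, labels, gamma=0.0)  # plain cross-entropy
fl = focal_loss(probs, labels, gamma=2.0)  # focal loss
print(f"cross-entropy = {ce:.4f}, focal loss = {fl:.4f}")
```

In this toy case the poorly predicted minority sample accounts for a much larger share of the focal loss than of the cross-entropy, which is the mechanism by which the loss shifts training attention toward hard, often minority-class, samples.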