Predicting the productivity of multistage fractured horizontal wells plays an important role in exploiting unconventional resources. In recent years, machine learning (ML) models have emerged as a new approach to this problem. However, although models trained on real data better characterize geological and engineering features, real data are often too scarce for model training, which leads to imprecise predictions. To tackle this issue, we propose an ML model that obtains reliable results even with a small number of data samples. Our model integrates the synthetic minority oversampling technique (SMOTE) to expand the data volume, a support vector machine (SVM) for model training, and the particle swarm optimization (PSO) algorithm for hyperparameter optimization. To enhance model performance, we conduct feature fusion and dimensionality reduction. Additionally, we examine the influence of different sample sizes and ML models on training. The proposed model demonstrates higher prediction accuracy and generalization ability, achieving an R² of up to 0.9 on the test set, compared with an R² of 0.13 for traditional ML techniques. The model accurately predicts the production of fractured horizontal wells even with limited samples, providing an efficient tool for optimizing the production of unconventional resources. Importantly, the model could potentially be applied to similar challenges in other fields constrained by scarce data.
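
The two data-independent building blocks named above, SMOTE-style oversampling and PSO, can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the implementation used in this work: the function names `smote_like_oversample` and `pso_minimize`, the toy well data, and all parameter values are hypothetical. In particular, the abstract does not state how SMOTE is adapted to a continuous production target, so interpolating the target together with the features is an assumption. The SVM itself is left out; a quadratic stand-in takes the place of the objective that PSO would minimize.

```python
import math
import random


def smote_like_oversample(rows, n_new, k=3, rng=None):
    """Create n_new synthetic rows by interpolating between a randomly chosen
    row and one of its k nearest neighbours (Euclidean distance).
    Assumption: the target is stored as the last column and is interpolated
    together with the features. Requires at least two rows."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(rows)
        # Rank all other rows by distance to the base row.
        others = sorted((r for r in rows if r is not base),
                        key=lambda r: math.dist(base, r))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, rng=None):
    """Basic particle swarm optimization; returns (best_position, best_value).
    w is the inertia weight, c1 and c2 the cognitive and social coefficients."""
    rng = rng or random.Random(0)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clip it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


# Toy data: each row is (feature_1, feature_2, target); the values are made up.
wells = [(1.0, 2.0, 10.0), (1.5, 1.8, 12.0), (3.0, 0.5, 20.0), (2.5, 1.0, 18.0)]
augmented = wells + smote_like_oversample(wells, n_new=8, rng=random.Random(42))


def toy_objective(p):
    # Stand-in for a validation error; its minimum is at (1, -2).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_params, best_score = pso_minimize(
    toy_objective, [(-5.0, 5.0), (-5.0, 5.0)], rng=random.Random(42))
```

In a full pipeline, `toy_objective` would be replaced by the cross-validated error of an SVM regressor (for example scikit-learn's `SVR`) trained on the augmented data, with the particle position encoding hyperparameters such as `C` and `gamma`. Validation folds would then normally be drawn from real samples only, so that synthetic rows do not leak into the error estimate.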