Lost circulation (LC) is a significant challenge in drilling operations: if it is not accurately identified and controlled, it can lead to severe accidents such as overflow or blowouts. This study introduces a data-driven method for identifying LC risk severity using machine learning and statistical algorithms, demonstrated on the Ying-Qiong Basin of the South China Sea. The method first applies variance filtering to eliminate invalid data, then uses the Pearson correlation coefficient to remove redundant data, which reduces the initial 38 logging parameters to 12. The data are then labeled using indirect judgments of LC severity based on relevant prior knowledge to construct the original dataset. A feature extraction and optimization method is proposed that uses sliding time windows to extract 21 features, which are then selected and optimized with a random forest algorithm to construct the optimal feature set. To enhance the controllability of the Extreme Learning Machine (ELM) model, the sparrow search algorithm is integrated, and several fitness functions are introduced for both regression and classification models, yielding the improved ELM (IELM) model. The dataset is split into training and test sets, and the IELM model is compared with the standard ELM and a backpropagation (BP) neural network. The IELM model outperforms the other two, achieving a test F1 score of 97.22% for identifying LC severity. Moreover, the model trained on the optimal feature set achieves higher accuracy and faster convergence. The proposed feature extraction and optimization method effectively increases the amount of information contained in the data, thereby enhancing the model's performance.
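
As an illustration only, the preprocessing steps named in the abstract (variance filtering, Pearson-based redundancy removal, and sliding-window feature extraction) might be sketched as follows. This is a minimal sketch under stated assumptions, not the study's implementation: the function names, the thresholds `1e-8` and `0.95`, the window width of 5, and the choice of mean and standard deviation as window statistics are all hypothetical, and the synthetic data stands in for real logging parameters.

```python
import numpy as np

def variance_filter(X, threshold=1e-8):
    """Return indices of columns whose variance exceeds the (assumed) threshold."""
    return np.flatnonzero(X.var(axis=0) > threshold)

def correlation_filter(X, max_abs_corr=0.95):
    """Greedily keep columns that are not strongly Pearson-correlated
    with any column already kept. The cutoff is an assumption."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < max_abs_corr for k in kept):
            kept.append(j)
    return np.array(kept, dtype=int)

def window_features(X, width=5):
    """Per-channel mean and standard deviation over each sliding time window.
    The study extracts 21 features; only two are shown here."""
    # Shape: (n_samples - width + 1, n_channels, width)
    windows = np.lib.stride_tricks.sliding_window_view(X, width, axis=0)
    return np.hstack([windows.mean(axis=-1), windows.std(axis=-1)])

# Synthetic stand-in for logging data: 200 time steps, 4 channels.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 2.0 * a + 0.01 * rng.normal(size=200)   # nearly a copy of a (redundant)
c = np.full(200, 3.0)                        # constant channel (invalid)
d = rng.normal(size=200)                     # independent channel
X = np.column_stack([a, b, c, d])

valid = variance_filter(X)                   # drops the constant channel
selected = valid[correlation_filter(X[:, valid])]  # drops the redundant one
features = window_features(X[:, selected], width=5)
```

Running the variance filter first matters in this sketch because a constant column has zero variance, which would make its Pearson correlation undefined.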