Petrophysical evaluation of clastic formations in boreholes with incomplete well log dataset by using joint inversion technique and machine learning algorithms

Abstract

A successful petrophysical evaluation of shaly-sand formations requires: 1) the availability of high-quality well log data, and 2) a petrophysical model that successfully represents the geological conditions of the rocks. Unfortunately, it is not always possible to fulfill these conditions, and in many cases the set of well logs is incomplete. To determine petrophysical parameters (i.e., volumes of laminar, structural and dispersed shale) in clastic rocks from incomplete well log data, we followed three approaches, which are based on a hierarchical model and on a joint inversion technique: 1) in the first approach, the available well log data (excluding the incomplete well log) are used to train machine learning algorithms to generate a predictive model; 2) in the first step of the second approach, machine learning algorithms are used to generate the missing data, which are subsequently included in a joint inversion; 3) in the third approach, a machine learning process is used to estimate the missing data, which are subsequently included in the prediction of the petrophysical properties. The supervised learning paradigm we used was based on different regression models (linear, decision tree, and kernel). A performance analysis of the three approaches was conducted with synthetic data representing real conditions of clastic formations from an oil field in southern Mexico. To allow a controlled analysis, we simulated gamma ray, deep resistivity, P-wave travel time, bulk density and neutron porosity logs by means of a hierarchical petrophysical model for clastic rocks. The three approaches were applied without P-wave travel time data to analyze the impact of the missing information. In general, the results indicate an adequate determination of the petrophysical parameters by each of the approaches. Evaluation metrics indicate that the best performance was obtained by the second approach, followed by approaches one and three. None of the three methods correctly resolved the volumes of the individual shale distributions, but the total shale content was predicted accurately, which suggests a non-uniqueness problem.
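
The data-generation step shared by the second and third approaches (estimating the missing P-wave travel time log from the remaining logs) can be illustrated with a minimal sketch. It is not the authors' hierarchical petrophysical model or their joint inversion: the synthetic log responses, their coefficients, and the function names `synth_logs`, `fit_linear` and `predict_linear` are assumptions made purely for illustration, and ordinary least squares stands in for the linear member of the regression models mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)

def synth_logs(n):
    """Toy shaly-sand logs. All response equations are invented for
    illustration and are NOT the hierarchical model of the paper."""
    vsh = rng.uniform(0.0, 0.6, n)                 # total shale volume fraction
    phi = rng.uniform(0.05, 0.30, n) * (1 - vsh)   # porosity, reduced by shale
    gr = 20 + 100 * vsh + rng.normal(0, 2, n)                # gamma ray (API)
    log_rt = 1.5 - 2.0 * phi - 1.0 * vsh + rng.normal(0, 0.02, n)  # log10 deep resistivity
    rhob = 2.65 - 1.65 * phi - 0.10 * vsh + rng.normal(0, 0.01, n)  # bulk density (g/cm3)
    nphi = phi + 0.30 * vsh + rng.normal(0, 0.005, n)        # neutron porosity (v/v)
    dt = 55 + 130 * phi + 35 * vsh + rng.normal(0, 1.0, n)   # P-wave travel time (us/ft)
    features = np.column_stack([gr, log_rt, rhob, nphi])
    return features, dt

def fit_linear(X, y):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply the fitted intercept and slopes to new feature rows."""
    return coef[0] + X @ coef[1:]

# Train on depth samples where the sonic log exists ...
X_train, dt_train = synth_logs(2000)
coef = fit_linear(X_train, dt_train)

# ... then estimate it where it is treated as missing.
X_test, dt_test = synth_logs(500)
dt_pred = predict_linear(coef, X_test)

# Compare against the held-back synthetic "truth".
rmse = float(np.sqrt(np.mean((dt_pred - dt_test) ** 2)))
baseline = float(np.std(dt_test))  # error of always predicting the mean
print(f"RMSE of predicted travel time: {rmse:.2f} us/ft (baseline {baseline:.2f})")
```

In a workflow like the second approach, the predicted travel time would then be passed, together with the measured logs, to the joint inversion; in one like the third approach, it would be passed to a second predictive model. Neither step is shown here.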

Similar Papers
  • Research Article
  • Cite Count Icon 2
  • 10.1016/j.geoen.2023.212057
Feature engineering process on well log data for machine learning-based SAGD performance prediction
  • Jun 25, 2023
  • Geoenergy Science and Engineering
  • Namhwa Kim + 2 more

Feature engineering process on well log data for machine learning-based SAGD performance prediction

  • Conference Article
  • Cite Count Icon 1
  • 10.2523/iptc-22081-ms
A Data Driven Machine Learning Approach to Predict the Nuclear Magnetic Resonance Porosity of the Carbonate Reservoir
  • Feb 21, 2022
  • Zeeshan Zeeshan Tariq + 3 more

Carbonate rocks have a very complex pore system due to the presence of interparticle and intra-particle porosities. This makes the acquisition and analysis of the petrophysical data, and the characterization of carbonate rocks a big challenge. Neutron porosity log and sonic porosity logs are usually considered as less accurate compared to the NMR porosity. Neutron-density porosity depends on parameters related to rock matrix which cause the inaccurate estimation of the porosity in special cases suchlike dolomitized and fractured zone. Whereas NMR porosity is based on the amount of hydrogen nuclei in the pore spaces and is independent of the rock minerals and is related to the pore spaces only. In this study, different machine learning algorithms are used to predict the Nuclear Magnetic Resonance (NMR) porosity. Conventional well logs such as Gamma ray, neutron porosity, deep and shallow resistivity logs, sonic traveltime, and photoelectric logs were used as an input parameter while NMR porosity log was set as an output parameter. More than 3500 data points were collected from several wells drilled in a giant carbonate reservoir of the middle eastern oil reservoir. Extensive data exploratory techniques were used to perform the data quality checks and remove the outliers and extreme values. Machine learning techniques such as random forest, deep neural networks, functional networks, and adaptive decision trees were explored and trained. The tuning of hyper parameters was performed using grid search and evolutionary algorithms approach. To optimize further the results of machine learning models, k-fold cross validation criterion was used. The evaluation of machine learning models was assessed by average absolute percentage error (AAPE), root mean square error (RMSE), and coefficient of correlation (R). The results showed that deep neural network performed better than the other investigated machine learning techniques based on lowest errors and highest R. 
The results showed that the proposed model predicted the NMR porosity with an accuracy of 94% when related to the actual values. In this study in addition to the development of optimized DNN model, an explicit empirical correlation is also extracted from the optimized model. The validation of the proposed model was performed by testing the model on other wells, the data of other wells were not used in the training. This work clearly shows that computer-based machine learning techniques can determine NMR porosity with a high precision and the developed correlation works extremely well in prediction mode.

  • Conference Article
  • Cite Count Icon 1
  • 10.30632/spwla-2022-0114
Multi-Level Reservoir Identification with Logs Based on Machine Learning
  • Jun 11, 2022
  • Gang Luo + 8 more

Machine learning algorithms have become powerful tools for modeling in the engineering field. They are suitable for solving problems that can't be effectively solved by traditional physical models or empirical models due to the complex relationship of variables. Since the traditional interpretation method of log data is based on petrophysical mechanisms and models, many assumptions are needed, which may lead to deviations in practical application. Therefore, it is of great significance to achieve reservoir fluid identification when using machine learning processing and interpreting log data. The existing reservoir identification methods have not thoroughly mined the internal relationships of log data. Moreover, the distribution of reservoir categories is seriously unbalanced. Reservoirs with similar physical properties are easily confused in identification. We propose an effective method of machine learning to solve the above problems. A long short-term memory network (LSTM) is used to characterize the time series characteristics of logs varying with depth domain. The kernel of the convolutional neural network (CNN) is used to slide on log curves to characterize their relationships. Considering the unbalanced distribution and the different development values of reservoirs categories, the weighted cross-entropy loss function is used to improve the weight of oil-bearing reservoirs with less distribution but higher development value when model training. According to the difference and similarity of reservoir physical properties, a multi-level reservoir identification process is designed: Level-I (reservoir and non-reservoir), Level-II (oil-bearing reservoirs, water-bearing reservoirs, and dry layer), and Level-III (oil layer, oil-water layer, poor oil layer, and water layer, oily-water layer). This method is verified on the log data of oil fields, in which the reservoir categories distribution is highly unbalanced. 
Moreover, the fraction of oil-bearing reservoirs is 9%, which agreement with the actual industrial situation. A series of comparative experiments proved that the parallel network structure of LSTM and CNN can fully examine the internal relationships and sequence characteristics of log curves. The weighted cross-entropy loss function significantly improves the fluid identification accuracy of oil-bearing reservoirs. Moreover, the multi-level reservoir identification method is more accurate in avoiding the identification confusion of reservoirs with similar physical properties. The experimental results demonstrate that this method is very practical and useful to help geological experts and engineers find reservoirs and complete evaluation.

  • Research Article
  • 10.52716/jprs.v15i1.869
Hyperparameter Optimization of Tree-Based Machine Learning (TB-ML) to Predict Permeability of a Heterogeneous Carbonate Oil Reservoir
  • Mar 21, 2025
  • Journal of Petroleum Research and Studies
  • Alqassim A Hasan + 4 more

Permeability is a crucial petrophysical attribute to be accurately estimated due to its direct influence on reservoir characterization, heterogeneity assessment, reservoir simulation, and the level of uncertainty in decision-making during field development planning. However, measuring permeability often involves expensive core analysis or well test analysis. It is typically not feasible to conduct such analysis across an entire reservoir involving cores from all wells. Therefore, there is a need to accurately model and predict permeability as a function of routinely obtained, lower cost, well logging data. Machine learning algorithms (ML) have been recently developed to reliably predict permeability by leveraging well logs data. In this research, an efficient tree-based (TB-ML) algorithm incorporating extreme gradient boosting (XGBoost) is employed to predict permeability in the Mishrif carbonate reservoir (Iraq) based on facies and well logging data. The recorded and interpreted well log variables used as input variables include gamma ray, caliper, density, neutron porosity, shallow and deep resistivity, total porosity, spontaneous potential, photoelectric factor, and water saturation. Additionally, core-derived permeability and porosity data is used to calibrate the ML predictions. The discrete reservoir facies are distinguished by applying a k-means clustering algorithm. Subsequently, the TB-ML algorithm is developed using the default and fine-tuned hyperparameters with the aid of two search algorithms: random search and Bayesian optimization. The permeability predictions are evaluated using cross-validation and error quantification metrics, which include the adjusted coefficient of determination (R2) and root mean squared error (RMSE). A comparison of adjusted-R2 and RMSE for the various TB-ML model configurations developed is compared for training and testing subsets to illustrate their permeability prediction performance. 
These results suggest that the method is sufficiently reliable to be generalized for application in both carbonate and clastic reservoirs in other oil and gas fields.

  • Conference Article
  • 10.2118/199776-stu
Randomness of Geophysical Log Data – Fractal Approach
  • Sep 23, 2019
  • Michal Figiel

Geophysical data allows for measuring a change in petrophysical parameters thought a whole well length. They often exhibit a chaotic behaviour which is difficult to describe and finding a pattern is near impossible. A potential measure of this chaos – correlation dimension – has been examined in the study. The research was carried out for the log data from Williston Basin, USA and the Norwegian Lille-Frigg oil field on the North Sea. Sonic log (DT), neutron porosity log (NPHI), deep resistivity log (LLD) as well as density log (RHOB) were utilised in the study. A python program has been written to measure the change in correlation dimension. Instead of calculating a one value of a correlation dimension for a whole log, a moving range algorithm was developed and implemented. It is based on defining a range for which the dimension is calculated and then moving the range on a geophysical log. In addition, a graph representing change of a correlation dimension with depth is drawn. The influence of data range and range shift were measured. Over 100 correlations have been carried out between rock properties and their dimension. The results indicate that the correlation dimensions change throughout the whole geophysical log and correlate with themselves and other curves in a moderate degree. It allows for determining ranges where a data set is not chaotic. The research shows that properly set range should have a reasonable and representative amount of data points, while the shift should be small for accurate results. Presented analysis creates perspectives for a more precise rock formation description and possible correlation between different oil wells within a single reservoir.

  • PDF Download Icon
  • Research Article
  • 10.4314/jasem.v23i12.16
Reservoir Characterization and Volumetric Analysis of “Lee” Field, Onshore Niger Delta, Using 3D Seismic and Well Log Data
  • Jan 29, 2020
  • Journal of Applied Sciences and Environmental Management
  • P Adigwe + 1 more

Three dimensional (3D) seismic data, and a suite of four geophysical well logs from four wells located on the Lee field, Niger Delta were analyzed using Petrel software for the aim of reservoir characterization and volumetric analysis of the field. The objectives among others include identification and delineation of the reservoirs and estimating the petrophysical parameters from the well logs available, generating time and depth structure maps of horizons from the seismic section, and a volumetric analysis in order to estimate hydrocarbon in place. The method adopted involves petrophysical analysis, structural analysis, static modelling, and volumetric analysis. Detailed petrophysical analysis revealed three reservoirs. Average Reservoir parameters such as effective porosity (0.17), gross thickness (86 m), hydrocarbon saturation (0.42), permeability (1215 mD) and net-to-gross (0.79) were derived from petrophysical analysis. The three reservoirs were classified using average results of petrophysical parameters. And based on these results, Reservoir 1 is the most prolific while Reservoir 3 is the least prolific within Lee field. Fault and Horizon interpretations were done using Petrel software which culminated in delivery of 3D structural map of the reservoirs. Structural,stratigraphic and Petrophysical models were developed and then integrated to produce a high resolution static model for Reservoir 1. The hydrocarbon in place shows that reservoir 1 is of appreciable thickness and areal extent. The volume of hydrocarbon originally in place was estimated to be 367,180,095.08 barrels of oil.Keywords: volumetric, petrophysical, fault, saturation, net-to-gross, permeability, horizon

  • Conference Article
  • 10.30632/spwla-2025-0050
A Monitoring and Analysis Platform for Real Time and Autonomous Formation Evaluation and Machine Learning Deployment of Multiple Concurrent Assets
  • May 17, 2025
  • Lautaro Rayo + 3 more

In upstream organizations, formation evaluation products are required by a multitude of subdisciplines in order to achieve their individual objectives. Moreover, some tasks require these products to be provided in real- time, such as geosteering, drilling management and formation sampling, while others require it in a time- sensitive fashion, such as completion and well testing design and material procurement. While simplified real- time procedural interpretation of logging-while-drilling (LWD) data has been implemented in numerous organizations, we believe this is the first production stage platform that automatizes model parameter selection of complex petrophysical models as targets are drilled, solving for the full non-linear system of equations as customary in modern formation evaluation practice. Additionally, our platform also deploys pre- trained Machine Learning models that augment data acquired prior to formation evaluation, or produces discipline-specific predictions post formation evaluation. The platform is served in a browser and provides a real- time formation evaluation for every wellbore where logging data is being acquired. The petrophysical models utilized by the platform are automatically allocated by a tree-based decision system that consumes data such as the wellbore field, the current geologic unit being drilled and facies identification based on the measured data. The petrophysical interpreter thus honors these models for newly incoming data streamed to a real-time database. Users with special privileges can at any time override the modeling decisions made by the platform and manually prescribe models that will be used over the data already acquired and over the data that is acquired from then on. This supervised-autonomy provides full interpretation flexibility while completely eliminating the need for the well log analyst to repeat any task, empowering specialists to seemingly monitor several wellbores at a time. 
In a similar fashion, the platform prescribes a number of Machine Learning models to be deployed over specific wellbores across specific geologic units based on a tree-based decision system. The platform maintainer keeps a pool of pre-trained indexed Machine Learning models available to the platform, covering a wide range of objectives, applications and disciplines, thus augmenting the value and usability of the streamed data. In one year, the platform produced real-time formation evaluation analysis for several hundreds of wellbores across multiple fields, serving between 20 to 30 concurrent wellbores at any given time. The platform is monitored by a much smaller group of specialists than would be required when operating over a traditional workflow. Subdisciplines depending on formation evaluation products have access to them 24/7 and in real- time through a web browser, eliminating unnecessary delays. Preliminary results indicate the platform could be responsible for improvements in geosteering outcomes and drilling efficiency. The authors believe this is the first production experience of sophisticated formation evaluation being automated and deployed in real-time over multiple concurrent assets. The integration of Machine Learning models into real-time formation evaluation products clears the path for substantial enhancements in geosteering accuracy and drilling efficiency.

  • Research Article
  • 10.1007/s13201-025-02547-6
Contribution of hydrogeological, well logs and machine learning in predicting the aquifer hydraulic properties in arid regions: a case study of Nubian Sandstone aquifer, Farafra Oasis, Egypt
  • Jul 5, 2025
  • Applied Water Science
  • Ahmed Nosair + 2 more

In hydrogeology, assessing key aquifer hydraulic parameters such as transmissivity (T), hydraulic conductivity (K), and porosity (PHIE) is crucial for effective groundwater management. Traditionally, these parameters are obtained through pumping tests and well log data. However, porosity logs are often lacking in most groundwater wells. While neutron density logs are commonly used for porosity estimation, our study uniquely employs resistivity logs to calculate porosity due to the scarcity of recorded logs in groundwater exploration. Consequently, this research aims to use conventional well log and hydrogeological data to predict T, K, and PHIE using machine learning (ML) algorithms, including random forest (RF), gradient boosting (GB), linear regression (LR), and support vector machines (SVM). This methodology is applied as a case study in the Nubian Sandstone Aquifer (NSA) in Farafra Oasis, Egypt. Firstly, T and k values were determined by analysis of the long duration pumping test records for ten wells penetrated the NSA. The performance of the ML algorithms in predicting transmissivity and hydraulic conductivity was rigorously evaluated using test wells. The RF model demonstrated superior accuracy, with predicted values of T and K being 113.11 m2/h and 0.2271 m/h in well W-6, and 104.15 m2/h and 0.1867 m/h in well W-8, respectively. The close agreement among actual and predicted values underscores the RF model’s reliability in estimating these parameters, effectively identifying the fundamental trends within the dataset. For porosity prediction, the RF and GB models exhibited excellent correlation with log-derived PHIE, achieving correlation coefficients of 0.95 and 0.96, respectively. In contrast, the LR model showed acceptable performance, while the SVM model had comparatively lower correlation. 
These findings highlight the potential of ML models, particularly RF and GB, in accurately predicting key aquifer hydraulic parameters, thereby enhancing the understanding and management of the groundwater aquifers.

  • Research Article
  • Cite Count Icon 55
  • 10.1016/j.petrol.2021.109681
A novel hybrid method of lithology identification based on k-means++ algorithm and fuzzy decision tree
  • Jan 1, 2022
  • Journal of Petroleum Science and Engineering
  • Quan Ren + 5 more

A novel hybrid method of lithology identification based on k-means++ algorithm and fuzzy decision tree

  • Conference Article
  • Cite Count Icon 1
  • 10.2118/69482-ms
Geological Constrained Log Filtering as a Basis for Scale Transference
  • Mar 25, 2001
  • A Z Remacre + 1 more

The choice of adequate scaling procedures is a crucial issue to geological reservoir modeling. The integration of fine scale geophysical well logging and petrophysical data with seismic data, and the procedures of transferring information from detailed geological models to large scale, dynamic reservoir models have been theme of intense debate in the literature. In this work, we propose a new approach that is focused on filtering well logging data in order to control data entry for seismic inversion and subsequent reservoir modeling. Spectral analysis using Fast Fourier Transform is performed for a set of wells arranged around a single line along the axis of the reservoir. Lithological geophysical logging (GR, sonic, litho-density and neutronporosity log) were fourier-transformed and their power spectra were analyzed. Observed main frequencies were compared to geological derived information in order to retain only significant architectural elements of the reservoir. Logging data are filtered for these frequencies and the resulting medium scale features are used to model synthetic wells from which the data for seismic inversion were derived. This procedure minimizes higher frequency entries at a level where they can be adequately scrutinized under a very effective geological framework. The proposed approach effectively handles with the various scale levels of geophysical and geological acquisition by filtering information from detailed logging data, retaining important geologic features characterization, and avoiding the introduction of high frequency noises for seismic inversion algorithms.

  • Research Article
  • Cite Count Icon 17
  • 10.1007/s40328-022-00385-5
Adaptive boosting of random forest algorithm for automatic petrophysical interpretation of well logs
  • Jan 1, 2022
  • Acta Geodaetica et Geophysica
  • V Srivardhan

The power of Machine Learning is demonstrated for automatic interpretation of well logs and determining reservoir properties for volume of shale, porosity, and water saturation respectively for tight clastic sequences. Random Forest algorithms are reputed for their efficiency as they belong to a class of algorithms called ensemble methods, which are traditionally seen as weak learners, but can be transformed into strong performers and they promise to deliver highly accurate results. The study area is located offshore Australia in the Poseidon and Crown fields situated in the Browse Basin, which are gas fields in tight complex clastic reservoirs. There are 5 wells used in this study with one well manually interpreted which is subsequently used in developing a machine learning model which predicts the output for the other 4 wells. The basic open hole logs namely Natural gamma ray, Resistivity, Neutron Porosity, Bulk Density, P-wave and S-wave sonic travel-time, are used in interpretation. One of the wells has a missing S-wave travel-time log which was also predicted by developing a Random Forest Machine Learning model. The results indicate a very robust improvement in performance when Random Forest algorithm was combined with Adaptive Boosting when interpreting the well logs. The training accuracy using Random Forest alone was 98.21%, but testing was 77.62% which suggested over-fitting by the Random Forest model. The Adaptive Boosting of the Random Forest algorithm resulted in the overall training accuracy of 99.40% and an overall testing accuracy of 97.03%, indicating a drastic improvement in performance. S-wave travel-time log was predicted by preparing a training set consisting of Natural gamma ray, Resistivity, Neutron Porosity, Bulk Density, and P-wave travel-time logs for the 4 wells using Random Forest which gave a training accuracy of 99.79% and a testing accuracy of 98.54%. 
Machine learning algorithms can be successfully applied for interpreting well log data in complex sedimentary environment and their performance can be drastically improved using Adaptive Boosting.

  • Research Article
  • Cite Count Icon 2
  • 10.1080/10916460903330064
The Productivity Estimation of Designed Horizontal Oil and Gas Wells Before a Drilling Operation, Using Seismic and Petrophysical Parameters and Modeling
  • Oct 13, 2010
  • Petroleum Science and Technology
  • M Mostafazadeh + 3 more

For feasibility studies of field development projects and management of time, risk, and cost, it is crucial to forecast the productivity of oil and gas wells. In this research a precise and comprehensive methodology is presented to estimate productivity in horizontal drilling. Petrophysical parameters of one of the Iranian oilfields including effective porosity, clay percentage, water saturation, and permeability were investigated in horizontal and vertical sections with specific intervals in all the reservoir spots using the well logging data. The seismic parameters of the reservoir including frequency, amplitude, and phase of the reflected seismic waves were also investigated. Petrophysical Reservoir Hydrocarbon Potential Index (RHPi)p and Seismic Reservoir Hydrocarbon Potential Index (RHPi)s were obtained by the combining of the petrophysical and seismic parameters, respectively. The former index cannot solely represent the hydrocarbon quality of the reservoir precisely. In ideal condition, the two indices must be equal for a specific section of the reservoir. Another index, the Average Reservoir Hydrocarbon Potential Index (RHPi)av, which is actually the average of the aforesaid indices, was defined. This index ensures high accuracy. By applying and generalizing (RHPi)av to the whole reservoir, another index known as the Well Efficiency Index (WEI) was obtained. This index can be used to estimate and correct the productivity of the designed horizontal wells before drilling, comparing actual productivity of vertical wells drilled in the oilfield. The result is a range of productivity with a particular safety factor that has a high degree of accuracy and precision.

  • PDF Download Icon
  • Research Article
  • Cite Count Icon 1
  • 10.3389/feart.2023.1047981
Forecast of lacustrine shale lithofacies types in continental rift basins based on machine learning: A case study from Dongying Sag, Jiyang Depression, Bohai Bay Basin, China
  • Apr 20, 2023
  • Frontiers in Earth Science
  • Zhengwei Fang + 2 more

Lacustrine shale in continental rift basins is complex and features a variety of mineralogical compositions and microstructures. The lithofacies type of shale, mainly determined by mineralogical composition and microstructure, is the most critical factor controlling the quality of shale oil reservoirs. Conventional geophysical methods cannot accurately forecast lacustrine shale lithofacies types, thus restricting the progress of shale oil exploration and development. Considering the lacustrine shale in the upper Es4 member of the Dongying Sag in the Jiyang Depression, Bohai Bay Basin, China, as the research object, the lithofacies type was forecast based on two machine learning methods: support vector machine (SVM) and extreme gradient boosting (XGBoost). To improve the forecast accuracy, we applied the following approaches: first, using core and thin section analyses of consecutively cored wells, the lithofacies were finely reclassified into 22 types according to mineralogical composition and microstructure, and the vertical change of lithofacies types was obtained. Second, in addition to commonly used well logging data, paleoenvironment parameter data (Rb/Sr ratio, paleoclimate parameter; Sr %, paleosalinity parameter; Ti %, paleoprovenance parameter; Fe/Mn ratio, paleo-water depth parameter; P/Ti ratio, paleoproductivity parameter) were applied to the forecast. Third, two sample extraction modes, namely, curve shape-to-points and point-to-point, were used in the machine learning process. Finally, the lithofacies type forecast was carried out under six different conditions. In the condition of selecting the curved shape-to-point sample extraction mode and inputting both well logging and paleoenvironment parameter data, the SVM method achieved the highest average forecast accuracy for all lithofacies types, reaching 68%, as well as the highest average forecast accuracy for favorable lithofacies types at 98%. 
The forecast accuracy for all lithofacies types improved by 7%–28% by using both well logging and paleoenvironment parameter data rather than using one or the other, and was 7%–8% higher by using the curve shape-to-point sample extraction mode compared to the point-to-point sample extraction mode. In addition, the learning sample quantity and data value overlap of different lithofacies types affected the forecast accuracy. The results of our study confirm that machine learning is an effective solution to forecast lacustrine shale lithofacies. When adopting machine learning methods, increasing the learning sample quantity (>45 groups), selecting the curve shape-to-point sample extraction mode, and using both well logging and paleoenvironment parameter data are effective ways to improve the forecast accuracy of lacustrine shale lithofacies types. The method and results of this study provide guidance to accurately forecast the lacustrine shale lithofacies types in new shale oil wells and will promote the harvest of lacustrine shale oil globally.

  • Research Article
  • Cite Count: 16
  • 10.1186/s12873-024-01135-2
Improving triage performance in emergency departments using machine learning and natural language processing: a systematic review
  • Nov 18, 2024
  • BMC Emergency Medicine
  • Bruno Matos Porto

Background: In Emergency Departments (EDs), triage is crucial for determining patient severity and prioritizing care, typically using the Manchester Triage Scale (MTS). Traditional triage systems, reliant on human judgment, are prone to under-triage and over-triage, resulting in variability, bias, and incorrect patient classification. Studies suggest that Machine Learning (ML) and Natural Language Processing (NLP) could enhance triage accuracy and consistency. This review analyzes studies on ML and/or NLP algorithms for ED patient triage.
Methods: Following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we conducted a systematic review across five databases: Web of Science, PubMed, Scopus, IEEE Xplore, and ACM Digital Library, from the inception of each database to October 2023. The risk of bias was assessed using the Prediction model Risk of Bias Assessment Tool (PROBAST). Only articles employing at least one ML and/or NLP method for patient triage classification were included.
Results: Sixty studies covering 57 ML algorithms were included. Logistic Regression (LR) was the most used model, while eXtreme Gradient Boosting (XGBoost), decision tree-based algorithms with Gradient Boosting (GB), and Deep Neural Networks (DNNs) showed superior performance. Frequent predictive variables included demographics and vital signs, with oxygen saturation, chief complaints, systolic blood pressure, age, and mode of arrival being the most retained. The ML algorithms showed significant bias risk due to critical bias assessment in classification models.
Conclusion: NLP methods improved ML algorithms' classification capability using triage nursing and medical notes and structured clinical data compared to algorithms using only structured data. Feature engineering (FE) and class imbalance correction methods enhanced ML workflows' performance, but FE and eXplainable Artificial Intelligence (XAI) were underexplored in this field.
Registration and funding: This systematic review has been registered (registration number: CRD42024604529) in the International Prospective Register of Systematic Reviews (PROSPERO) and can be accessed online at the following URL: https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=604529. Funding for this work was provided by the National Council for Scientific and Technological Development (CNPq), Brazil.

  • Research Article
  • Cite Count: 2
  • 10.69882/adba.cs.2024073
Enhancing Anomaly Detection in Large-Scale Log Data Using Machine Learning: A Comparative Study of SVM and KNN Algorithms with HDFS Dataset
  • Jul 1, 2024
  • ADBA Computer Science
  • Yusuf Alaca + 2 more

As information technology rapidly advances, servers, mobile, and desktop applications are easily attacked due to their high value. Therefore, cyber attacks have raised great concerns in many areas. Anomaly detection plays a significant role in the field of cyber attacks, and log records, which record detailed system runtime information, have consequently become an important data analysis object. Traditional log anomaly detection relies on programmers manually inspecting logs through keyword searches and regular expression matching. While programmers can use intrusion detection systems to reduce their workload, log data is massive, attack types are diverse, and the advancement of hacking skills makes traditional detection inefficient. To improve traditional detection technology, many anomaly detection mechanisms, especially machine learning methods, have been proposed in recent years. In this study, an anomaly detection system using two different machine learning algorithms is proposed for large log data. Using Support Vector Machines (SVM) and K-Nearest Neighbors (KNN) algorithms, experiments were conducted with the Hadoop Distributed File System (HDFS) log dataset, and experimental results show that this system provides higher detection accuracy and can detect unknown anomaly data.

More from: Geofísica Internacional
  • Research Article
  • 10.22201/igeof.2954436xe.2025.64.3.1830
Prediction of permeability and effective porosity values using ANN in Maleh field
  • Jun 27, 2025
  • Geofísica Internacional
  • Mohammed Essa Nassani + 1 more

  • Research Article
  • 10.22201/igeof.2954436xe.2025.64.3.1787
Application of Electrical Resistivity Tomography for Cost-Effective Planning in Diabase Gravel Mining Operations in Southeastern Brazil
  • Jun 27, 2025
  • Geofísica Internacional
  • Lenon Melo Ilha + 5 more

  • Research Article
  • 10.22201/igeof.2954436xe.2025.64.3.1803
Petrophysical evaluation of clastic formations in boreholes with incomplete well log dataset by using joint inversion technique and machine learning algorithms
  • Jun 27, 2025
  • Geofísica Internacional
  • Felipe Santana-Román + 4 more

  • Research Article
  • 10.22201/igeof.2954436xe.2025.64.3.1853
Focal mechanisms and tectonic stress field in the Valley of Mexico from local seismicity
  • Jun 27, 2025
  • Geofísica Internacional
  • Delia Iresine Bello Segura + 2 more

  • Research Article
  • 10.22201/igeof.2954436xe.2025.64.3.1749
Seismicity Associated to a Shield Volcano in a Monogenetic Volcanic Field: The Case of San Martin Tuxtla Volcano, Veracruz, Mexico
  • Jun 27, 2025
  • Geofísica Internacional
  • Juan Manuel Espindola + 2 more

  • Research Article
  • 10.22201/igeof.2954436xe.2025.64.3.1849
Pre-eruptive conditions of a magma mixing-assimilation systems at La Malinche volcano, Mexico
  • Jun 27, 2025
  • Geofísica Internacional
  • Johana Andrea Gomez Arango + 2 more

  • Journal Issue
  • 10.22201/igeof.2954436xe.2025.64.3
  • Jun 27, 2025
  • Geofísica Internacional

  • Research Article
  • 10.22201/igeof.2954436xe.2025.64.3.1835
Determination of the seismic structure of the Pantanal basin from phase velocity analysis: implementation of the first higher mode of Rayleigh waves
  • Jun 27, 2025
  • Geofísica Internacional
  • Andrés D'Onofrio + 2 more

  • Research Article
  • 10.22201/igeof.2954436xe.2025.64.3.1827
Impact of 2019 Earthquakes on Shallow Aquifers in Northern sub-Himalayan Pakistan: A Detailed Analysis of Mirpur and Surrounding Areas
  • Jun 27, 2025
  • Geofísica Internacional
  • Abrar Niaz + 4 more

  • Research Article
  • 10.22201/igeof.2954436xe.2025.64.2.1804
Geomechanic modeling of seismic emission due to fracture growth - connection to microseismic source mechanisms
  • Apr 1, 2025
  • Geofísica Internacional
  • Sergey Yaskevich + 2 more
