Driven Machine Learning Research Articles

Introduction: Blood count analysers are routinely used to assess health status and to monitor disease. The major analysers generate more than 300 red cell, leukocyte and platelet parameters. In clinical practice, however, only 10-20 are utilised; laboratories review a greater number, but some of these serve only to ensure instrument stability and quality assurance. Although hundreds more measurable parameters are available through extended analytical channels, they are not assessed because of the complexity of the numerical analysis and questions about their clinical utility. These extended analytical parameters nevertheless have the potential to generate real-time ‘3-dimensional (3-D)’ details of blood cells. Aim: In this study we assessed 312 parameters generated from 5,800 patient samples (of all ages and genders) at a tertiary care hospital (National Institute of Blood Disease, Karachi, Pakistan). These were exported anonymously and processed by machine learning (ML). Method: The methodology of the present study was based on the waterfall model. The output data from a haematology analyser (Sysmex XN-1000, Kobe, Japan), in CSV format with a total of 433 columns, were pre-processed with the Pandas and NumPy libraries to remove unnecessary features (such as date, analysis details, rack position, receiving time, alerts and others), and scaling was performed where required. Data were labelled according to the conclusions reported by the respective confirmatory tests: the processed data (312 features of the 433 columns) were labelled with 67 such conclusions. The extracted data were fed to the ML models through Python modules and libraries for training, testing and validation.
The web application was developed on a modern Python framework (Flask) to automate the workflow and to offer a ‘drag and drop’ option for the CSV file exported from the analyser. We connected the pre-processing (data engineering), machine learning and prediction view using a set of tools (mainly JavaScript libraries). The application generated various metrics, including accuracy, precision, recall and the harmonic mean of precision and recall (F1 score), for the predictions made from the CSV file submitted to our system. For each prediction, the full precision report and a prediction box were added as an optional visualisation on our web panel. Results: Analysis of 1.8 million data points (312 parameters x 5,800 samples) showed promising predictive potential: in a pilot principal component analysis (PCA), the total variance retained was 41.6%, indicating that a linear combination of parameters can explain much of the variability. On a heat map, clustering supported the predictive potential of these 3-D blood cell features, and visualisation showed their characteristic deviational trends (fingerprints). Examples included separation of myeloid from lymphoid, chronic from acute, bacterial from viral, and iron deficiency from vitamin B12/folic acid deficiency, as well as differentiation of haemoglobinopathies. The patterns of normal, immature and abnormal blood cells, collectively termed cell population data, were well demonstrated by the results of our machine learning models. Of note, we observed an accuracy of 85.6% and a precision of 91.2% for one of the ML models used (Random Forest Classifier). Conclusion: High-dimensional cell population data derived from a complete blood count present both opportunities and challenges, and can provide a novel patient-specific haematological fingerprint. This extended deviational patterning (fingerprint) can provide interpretive diagnostic data with practical disease-specific patterns.
This pilot study shows that machine learning applications driven by complete blood count data have great potential to uncover disease-associated patterns that could be applied in practice. They also have the capacity to provide baseline testing, which could assist in sequential health monitoring and potentially in the generation of personal reference ranges.
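The pipeline described in this abstract (dropping non-analytical columns, scaling, a PCA check, and a random forest evaluated by accuracy, precision, recall and F1) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' code: the data are synthetic, the column names in `METADATA_COLUMNS`, the `label` column and the helper names are hypothetical, and the hyperparameters are scikit-learn defaults rather than anything reported in the study.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical metadata column names; a real analyser export will differ.
METADATA_COLUMNS = ["date", "rack_position", "receiving_time", "alerts"]


def make_synthetic_export(n_samples=300, n_features=20, n_classes=3, seed=0):
    """Build a small synthetic stand-in for an analyser CSV export."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, n_classes, size=n_samples)
    # Shift each class slightly so the features carry some signal.
    values = rng.normal(size=(n_samples, n_features)) + labels[:, None] * 0.8
    frame = pd.DataFrame(values, columns=[f"param_{i}" for i in range(n_features)])
    frame["date"] = "2020-01-01"
    frame["rack_position"] = rng.integers(1, 10, size=n_samples)
    frame["receiving_time"] = "09:00"
    frame["alerts"] = ""
    frame["label"] = labels
    return frame


def preprocess(frame):
    """Drop non-analytical columns and split features from labels."""
    cleaned = frame.drop(columns=METADATA_COLUMNS, errors="ignore")
    y = cleaned.pop("label").to_numpy()
    X = cleaned.to_numpy(dtype=float)
    return X, y


def evaluate(frame, seed=0):
    """Scale, inspect PCA variance, train a random forest, report metrics."""
    X, y = preprocess(frame)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Fit the scaler on the training split only, to avoid leakage.
    scaler = StandardScaler().fit(X_train)
    X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

    # Fraction of variance retained by the first two principal components.
    pca = PCA(n_components=2).fit(X_train_s)
    explained = float(pca.explained_variance_ratio_.sum())

    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train_s, y_train)
    y_pred = model.predict(X_test_s)

    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    return {
        "n_features": X.shape[1],
        "pca_explained_variance": explained,
        "accuracy": float(accuracy_score(y_test, y_pred)),
        "precision": float(precision),
        "recall": float(recall),
        "f1": float(f1),
    }


if __name__ == "__main__":
    for name, value in evaluate(make_synthetic_export()).items():
        print(f"{name}: {value}")
```

Macro averaging is used here only because it treats every class equally; with 67 labels of very different frequency, the choice of averaging would change the reported precision noticeably, and the abstract does not say which was used.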


In this paper, a hybrid machine learning model is applied to evaluate the relationship between random initial states and the power system’s vulnerability to cascading outages. A cascading outage simulator (CS), which uses off-line AC power flows, is proposed for generating training data. The initial states are randomly selected and the CS model is run for each initial state, where power system generation and loads are adjusted dynamically and power flows are redistributed to quantify the vulnerability metric. Furthermore, the proposed hybrid machine learning model combines Support Vector Machine (SVM) classification with Gradient Boosting Regression (GBR) to improve the learning precision. The classification model is trained by SVM, which divides the data into two categories: with and without load shedding. GBR is then applied only to the data with load shedding to determine the relationship between input power outage states and the vulnerability metric. The proposed vulnerability analysis approach is applied to several test systems and the results are analyzed. Note to Practitioners: The power system vulnerability can be quantified by cascading outage simulations. However, there are two challenges: i) there is a huge number of possible initial states, so we cannot enumerate all of them for the cascading outage simulation, nor can we precisely quantify the bus vulnerability; ii) the cascading outage simulation may be time-consuming for large-scale power systems, which is challenging for online application. To address these challenges, we aim to design a machine learning technique that predicts the power system vulnerability, with the model trained offline and then used online.
Firstly, since there is not enough operational data from practical power systems, we develop a cascading outage simulator, using off-line AC power flows, to generate synthetic training data. Secondly, we observe that the training precision obtained by directly applying a regression model may be very poor, because the output of the machine learning model may be unevenly distributed with respect to the input parameters. Thus, we propose a hybrid machine learning model that combines classification and regression: the classification model is employed to remove the data without load shedding, and the regression model then determines the relationship between input power outage states and the vulnerability metric. The proposed model and method have been tested on several systems, including a practical large-scale Polish power system, to demonstrate their effectiveness.
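The two-stage idea in this abstract (an SVM separates states with and without load shedding, and gradient boosting regression is fitted only on the states with load shedding) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class `HybridVulnerabilityModel`, the synthetic generator `make_synthetic_states`, and the convention that a vulnerability metric of zero means "no load shedding" are all assumptions made for the example, and no power-flow simulation is involved.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.svm import SVC


class HybridVulnerabilityModel:
    """Two-stage sketch: SVM flags load shedding, GBR estimates the metric."""

    def __init__(self, seed=0):
        self.classifier = SVC(kernel="rbf")
        self.regressor = GradientBoostingRegressor(random_state=seed)

    def fit(self, X, y):
        # Assumption: a metric of exactly zero means no load shedding occurred.
        shed = y > 0
        self.classifier.fit(X, shed)
        # The regressor only ever sees the samples with load shedding.
        self.regressor.fit(X[shed], y[shed])
        return self

    def predict(self, X):
        shed = self.classifier.predict(X).astype(bool)
        out = np.zeros(len(X))
        if shed.any():
            out[shed] = self.regressor.predict(X[shed])
        return out


def make_synthetic_states(n=400, n_features=6, seed=0):
    """Synthetic stand-in for simulator output; not real power-system data."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, size=(n, n_features))
    stress = X[:, 0] + X[:, 1]
    # Roughly half the states produce no load shedding (metric of zero).
    y = np.where(stress > 0, stress ** 2, 0.0)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_states()
    model = HybridVulnerabilityModel().fit(X[:300], y[:300])
    pred = model.predict(X[300:])
    print("mean absolute error:", float(np.mean(np.abs(pred - y[300:]))))
```

Routing the zero-metric states away from the regressor is what addresses the uneven output distribution mentioned above: a single regressor fitted on all states would have to model a large spike at zero alongside the continuous part.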




Articles published on Driven Machine Learning

Comparison of the diagnostic accuracy of resting-state fMRI driven machine learning algorithms in the detection of mild cognitive impairment

Blood Count Scattergrams Are Fingerprints of Blood: Using AI to Inform Health Status

Predicting real-time within-vehicle air pollution exposure with mass-balance and machine learning approaches using on-road and air quality data

Deep Learning–Based Summertime Turbulence Intensity Estimation Using Satellite Observations

Model and Data Driven Machine Learning Approach for Analyzing the Vulnerability to Cascading Outages With Random Initial States in Power Systems

Low-cycle fatigue life assessment of SAC solder alloy through a FEM-data driven machine learning approach

An application of ontology driven machine learning model challenges for the classification of social media data: a systematic literature review

Three-step learning strategy for designing 15Cr ferritic steels with enhanced strength and plasticity at elevated temperature

CAMERA-BASED REMOTE PHOTOPLETHYSMOGRAPHY TO PREDICT BLOOD PRESSURE IN CLINIC PATIENTS WITH CARDIOVASCULAR DISEASE

Design of high-performance high-entropy nitride ceramics via machine learning-driven strategy

Machine Learning Based Risk Prediction for Major Adverse Cardiovascular Events for ELGA-Authorized Clinics

Data driven machine learning models for short‐term load forecasting considering electrical vehicle load

Using machine learning to determine the time of exposure to infection by a respiratory pathogen

Prototype Theory Meets Word Embedding: A Novel Approach for Text Categorization via Granular Computing

Prediction of dose deposition matrix using voxel features driven machine learning approach.

Machine learning based fault detection and state of health estimation of proton exchange membrane fuel cells

MSR19 An Update on Real-Time Application of Machine Learning Programs to Improve Cardiovascular Risk Prediction in European Population

A data driven machine learning approach to differentiate between autism spectrum disorder and attention-deficit/hyperactivity disorder based on the best-practice diagnostic instruments for autism

Machine learning model to predict endophytic colonisation of rice cultivar plant tissues by Beauveria bassiana isolates and their potential as bio-control agents against rice stem borer using existing knowledge

Prediction of pullout interaction coefficient of geogrids by extreme gradient boosting model

