Balanced Training Set Research Articles

Expensive multiobjective optimization problems pose great challenges to evolutionary algorithms due to their costly evaluation. Building cheap surrogate models to replace the expensive real models has been proved to be a practical way to reduce the number of costly evaluations. Supervised learning techniques from the community of machine learning have been widely applied to build either regressors, which approximate the fitness values of candidate solutions, or classifiers, which estimate the categories of candidate solutions. Considering the characteristics of the data produced in optimization, this article proposes a new surrogate model, called a relation model, for evolutionary multiobjective optimization. Instead of estimating the qualities of candidate solutions directly, the relation model tries to estimate the relationship between a pair of solutions <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\langle \mathbf {x}, \mathbf {y}\rangle $ </tex-math></inline-formula> , i.e., <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\mathbf {x}$ </tex-math></inline-formula> dominates <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\mathbf {y}$ </tex-math></inline-formula> , <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\mathbf {x}$ </tex-math></inline-formula> is dominated by <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\mathbf {y}$ </tex-math></inline-formula> , or <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\mathbf {x}$ </tex-math></inline-formula> is nondominated with <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\mathbf {y}$ </tex-math></inline-formula> in the case of multiobjective optimization. To implement this idea, first a balanced training set is prepared, then a classifier is built based on the training data set to learn the relationship, and finally, the classifier with a voting-scoring strategy is applied to estimate the relationship between the candidate solutions and parent solutions. By this way, the promising candidate solutions are recognized and evaluated by the real models. The new approach is applied to three well-known benchmark suites and two real-world applications, and the experimental results suggest that the proposed method outperforms some state-of-the-art methods based on regression and classification models on the given instances.

BackgroundMaternal and neonatal health outcomes in low- and middle-income countries (LMICs) have improved over the last two decades. However, many pregnant women still deliver at home, which increases the health risks for both the mother and the child. Community health worker programs have been broadly employed in LMICs to connect women to antenatal care and delivery locations. More recently, employment of digital tools in maternal health programs have resulted in better care delivery and served as a routine mode of data collection. Despite the availability of rich, patient-level data within these digital tools, there has been limited utilization of this type of data to inform program delivery in LMICs.MethodsWe use program data from 38,787 women enrolled in Safer Deliveries, a community health worker program in Zanzibar, to build a generalizable prediction model that accurately predicts whether a newly enrolled pregnant woman will deliver in a health facility. We use information collected during the enrollment visit, including demographic data, health characteristics and current pregnancy information. We apply four machine learning methods: logistic regression, LASSO regularized logistic regression, random forest and an artificial neural network; and three sampling techniques to address the imbalanced data: undersampling of facility deliveries, oversampling of home deliveries and addition of synthetic home deliveries using SMOTE.ResultsOur models correctly predicted the delivery location for 68%–77% of the women in the test set, with slightly higher accuracy when predicting facility delivery versus home delivery. A random forest model with a balanced training set created using undersampling of existing facility deliveries accurately identified 74.4% of women delivering at home.ConclusionsThis model can provide a “real-time” prediction of the delivery location for new maternal health program enrollees and may enable early provision of extra support for individuals at risk of not delivering in a health facility, which has potential to improve health outcomes for both mothers and their newborns. The framework presented here is applicable in other contexts and the selection of input features can easily be adapted to match data availability and other outcomes, both within and beyond maternal health.

Balanced Training Set Research Articles

Related Topics

Articles published on Balanced Training Set

Probabilistic Contrastive Learning for Long-Tailed Visual Recognition.

Imbalanced Diagnosis Scheme for Incipient Rotor Faults in Inverter-Fed Induction Motors

Predicting Equity Crowdfunding Success

Prediction of cervix cancer stage and grade from diffusion weighted imaging using EfficientNet

Multiple adaptive over-sampling for imbalanced data evidential classification

Integrated algorithm evaluation of a scalable digital power management innovation model

Learning to Generate an Unbiased Scene Graph by Using Attribute-Guided Predicate Features

A novel anomaly detection approach based on ensemble semi-supervised active learning (ADESSA)

Balanced incremental deep reinforcement learning based on variational autoencoder data augmentation for customer credit scoring

An omnidirectional approach to touch-based continuous authentication

BADM-Net: Hierarchical Classification Network for Identifying Anomalous Trends in Bridge Monitoring Data Patterns

A Diffusion Model Based on Network Intrusion Detection Method for Industrial Cyber-Physical Systems.

Prediction of rock mass class ahead of TBM excavation face by ML and DL algorithms with Bayesian TPE optimization and SHAP feature analysis

Deleterious synonymous mutation identification based on selective ensemble strategy.

Evaluation of four machine learning models for signal detection.

Probabilistic learning for pulsar classification

Expensive Multiobjective Optimization by Relation Learning and Prediction

Evaluation of deep learning and transform domain feature extraction techniques for land cover classification: balancing through augmentation.

Machine learning for maternal health: Predicting delivery location in a community health worker program in Zanzibar.

Research on Improved Random Forest Algorithm for Highly Unbalanced Data

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Balanced Training Set Research Articles

Related Topics

Articles published on Balanced Training Set

Probabilistic Contrastive Learning for Long-Tailed Visual Recognition.

Imbalanced Diagnosis Scheme for Incipient Rotor Faults in Inverter-Fed Induction Motors

Predicting Equity Crowdfunding Success

Prediction of cervix cancer stage and grade from diffusion weighted imaging using EfficientNet

Multiple adaptive over-sampling for imbalanced data evidential classification

Integrated algorithm evaluation of a scalable digital power management innovation model

Learning to Generate an Unbiased Scene Graph by Using Attribute-Guided Predicate Features

A novel anomaly detection approach based on ensemble semi-supervised active learning (ADESSA)

Balanced incremental deep reinforcement learning based on variational autoencoder data augmentation for customer credit scoring

An omnidirectional approach to touch-based continuous authentication

BADM-Net: Hierarchical Classification Network for Identifying Anomalous Trends in Bridge Monitoring Data Patterns

A Diffusion Model Based on Network Intrusion Detection Method for Industrial Cyber-Physical Systems.

Prediction of rock mass class ahead of TBM excavation face by ML and DL algorithms with Bayesian TPE optimization and SHAP feature analysis

Deleterious synonymous mutation identification based on selective ensemble strategy.

Evaluation of four machine learning models for signal detection.

Probabilistic learning for pulsar classification

Expensive Multiobjective Optimization by Relation Learning and Prediction

Evaluation of deep learning and transform domain feature extraction techniques for land cover classification: balancing through augmentation.

Machine learning for maternal health: Predicting delivery location in a community health worker program in Zanzibar.

Research on Improved Random Forest Algorithm for Highly Unbalanced Data