Cross-validation Experiments Research Articles

Background: Imbalance between positive and negative outcomes, known as class imbalance, is common in medical data. Despite extensive study, it remains a difficult problem. The main objective of this study was to find an effective integrated approach to the problems posed by class imbalance and to validate the method in an early screening model for a rare cardiovascular disease, aortic dissection (AD).

Methods: Data-level methods, cost-sensitive learning, and bagging were combined to address the low sensitivity caused by the imbalance between the two classes. First, feature selection was applied to choose the most relevant features using statistical analysis, including significance tests and logistic regression. Then, we assigned different misclassification costs to the two classes, constructed weak classifiers based on the support vector machine (SVM) model, and integrated the weak classifiers with undersampling and bagging to build the final strong classifier. Because AD is rare, the data imbalance was particularly prominent, so we applied our method to the construction of an early screening model for AD. Clinical data of 523,213 patients from the Institute of Hypertension, Xiangya Hospital, Central South University were used to verify the validity of this method. In these data, the ratio of AD patients to non-AD patients was 1:65, and each sample contained 71 features.

Results: The proposed ensemble model achieved the highest sensitivity, 82.8%, with a training time of 56.4 s and a specificity of 71.9%. It also showed a small variance in sensitivity, 19.58 × 10⁻³, in the seven-fold cross-validation experiment. These results outperformed the common ensemble algorithms AdaBoost, EasyEnsemble, and Random Forest (RF), as well as the single machine learning (ML) methods logistic regression, decision tree, k-nearest neighbors (KNN), back-propagation neural network (BP), and SVM. Among the five single ML algorithms, the SVM model with cost-sensitive learning performed best, with a sensitivity of 79.5% and a specificity of 73.4%.

Conclusions: In this study, we demonstrate that integrating feature selection, undersampling, cost-sensitive learning, and bagging can overcome the challenge of class imbalance in a medical dataset and produce a practical screening model for AD, which could provide decision support for screening for AD at an early stage.
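The undersampling and bagging steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `balanced_bags` and `majority_vote`, the use of label 1 for the rare class, the equal-size draw from the majority class, and the tie-breaking rule are all assumptions made for illustration. The weak learner (a cost-sensitive SVM in the abstract) is deliberately left out; only the resampling and vote-combining logic is shown.

```python
import random
from collections import Counter


def balanced_bags(labels, n_bags, seed=0):
    """Build n_bags lists of sample indices for training weak classifiers.

    Assumption: label 1 marks the rare class and label 0 the common class.
    Each bag keeps every rare-class sample and adds an equally sized random
    draw (without replacement) from the common class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(minority), len(majority))
    return [minority + rng.sample(majority, k) for _ in range(n_bags)]


def majority_vote(votes):
    """Combine the 0/1 predictions of all weak classifiers for one sample.

    Assumption: ties are resolved in favour of the positive class, which
    trades specificity for sensitivity in a screening setting.
    """
    counts = Counter(votes)
    return 1 if counts[1] >= counts[0] else 0


# Toy labels at the 1:65 ratio mentioned in the abstract (2 positives, 130 negatives).
labels = [1] * 2 + [0] * 130
bags = balanced_bags(labels, n_bags=5)
```

Each index list in `bags` would be used to fit one weak classifier, and `majority_vote` would merge their per-sample predictions into the ensemble output.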

18F-fluorodeoxyglucose (FDG)-positron emission tomography (PET) reveals altered brain metabolism in individuals with mild cognitive impairment (MCI) and Alzheimer’s disease (AD). Some biomarkers derived from FDG-PET by computer-aided diagnosis (CAD) technologies have been shown to distinguish accurately between normal control (NC), MCI, and AD. However, existing FDG-PET-based studies are still insufficient for identifying early MCI (EMCI) and late MCI (LMCI). Compared with methods based on other modalities, current FDG-PET methods also make inadequate use of inter-region features for the diagnosis of early AD. Moreover, because individuals vary, some hard samples that closely resemble both classes limit classification performance. To tackle these problems, we propose a novel bilinear pooling and metric learning network (BMNet), which extracts inter-region representation features and distinguishes hard samples by constructing an embedding space. To validate the proposed method, we collected 898 FDG-PET images from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), including 263 NC, 290 EMCI, 147 LMCI, and 198 AD patients. Following common preprocessing steps, 90 features are extracted from each FDG-PET image according to the automated anatomical labeling (AAL) template and then fed into the proposed network. Extensive fivefold cross-validation experiments are performed for multiple two-class classification tasks. The experiments show that most metrics improve after the bilinear pooling module and the metric losses are each added to the baseline model. Specifically, in the classification task between EMCI and LMCI, specificity improves by 6.38% after adding the triple metric loss, and the negative predictive value (NPV) improves by 3.45% after adding the bilinear pooling module. In addition, the accuracy of classification between EMCI and LMCI reaches 79.64% using imbalanced FDG-PET images, indicating that the proposed method yields state-of-the-art accuracy for EMCI-versus-LMCI classification based on PET images.
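For readers unfamiliar with the metrics quoted in this abstract, the following minimal sketch computes sensitivity, specificity, NPV, and accuracy from binary predictions. The function name `binary_metrics`, the convention that label 1 is the positive class, and the toy labels are assumptions for illustration; the abstract does not say which class (EMCI or LMCI) is treated as positive, and none of the numbers below come from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return standard two-class metrics, assuming label 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # TP / (TP + FN)
        "specificity": ratio(tn, tn + fp),  # TN / (TN + FP)
        "npv": ratio(tn, tn + fn),          # TN / (TN + FN)
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Toy example: 3 positives and 5 negatives, with one error in each class.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

In a k-fold cross-validation experiment, a function like this would typically be applied to the held-out fold of each split and the per-fold values averaged.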

Articles published on Cross-validation Experiments

Solving the class imbalance problem using ensemble algorithm: application of screening for aortic dissection

Prediction of Pulmonary Function Parameters Based on a Combination Algorithm.

Cross-validation of a semantic segmentation network for natural history collection specimens

An active semi-supervised deep learning model for human activity recognition

ConCeptCNN: A novel multi-filter convolutional neural network for the prediction of neurodevelopmental disorders using brain connectome.

User-oriented Natural Human-Robot Control with Thin-Plate Splines and LRCN

BMNet: A New Region-Based Metric Learning Method for Early Alzheimer's Disease Identification With FDG-PET Images.

SAAED: Embedding and Deep Learning Enhance Accurate Prediction of Association Between circRNA and Disease.

A Robust Cybersecurity Topic Classification Tool

Series Arc Fault Detection in a Low-Voltage Power System Based on CEEMDAN Decomposition and Sensitive IMF Selection

FB-CGANet: filter bank channel group attention network for multi-class motor imagery classification

A Modified XG Boost Classifier Model for Detection of Seizures and Non-Seizures

M2PP: a novel computational model for predicting drug-targeted pathogenic proteins

Ligand-based discovery of new potential acetylcholinesterase inhibitors for Alzheimer’s disease treatment

A General and Scalable Vision Framework for Functional Near-Infrared Spectroscopy Classification.

Large-scale Text Multiclass Classification Using Spark ML Packages

Predicting spatiotemporal variability in radial tree growth at the continental scale with machine learning

Hepatitis C Virus Detection Model by Using Random Forest, Logistic-Regression and ABC Algorithm

Predicting miRNA-Disease Associations Based On Multi-View Variational Graph Auto-Encoder With Matrix Factorization.

Spatial-Temporal Feature Fusion Neural Network for EEG-Based Emotion Recognition
