A Novel Ensemble Learning Paradigm for Medical Diagnosis With Imbalanced Data

Na Liu, Xiaomei Li, Man Xu, Bo Gao, Ershi Qi, Ling Li

doi:10.1109/access.2020.3014362


Open Access

https://doi.org/10.1109/access.2020.3014362


Abstract

With the help of machine learning (ML) techniques, errors that pathologists and physicians may make because of inexperience, fatigue, stress and so on can be avoided, and medical data can be examined in less time and in greater detail. However, although conventional ML classification techniques achieve excellent classification accuracy in medical diagnosis, they perform poorly on imbalanced datasets, especially in detecting the minority category. To tackle this shortcoming of conventional classification approaches, this study proposes a novel ensemble learning paradigm for medical diagnosis with imbalanced data, which consists of three phases: data pre-processing, base classifier training and final ensemble. In the data pre-processing phase, we extend the Synthetic Minority Oversampling Technique (SMOTE) by integrating it with the cross-validated committees filter (CVCF) technique, which not only synthesizes minority samples to balance the input instances but also filters out noisy examples so that classification performs well. In the classification phase, we introduce an ensemble support vector machine (ESVM) classification technique, which is constructed from multiple SVM classifiers with diverse structures and thus offers strong generalization performance and classification precision. In the final ensemble phase, we adopt a weighted majority voting strategy and use a simulated annealing genetic algorithm (SAGA) to optimize the weight vector and thereby enhance the overall classification performance. The proposed ensemble learning method was tested on nine imbalanced medical datasets, and the experimental results indicate that it outperforms other state-of-the-art classification models. The proposed paradigm can therefore effectively facilitate medical decision making for physicians.
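
To make two of the steps named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It shows the interpolation idea behind SMOTE (a synthetic minority sample placed on the segment between a minority sample and one of its minority-class neighbors) and a weighted majority vote over binary base-classifier outputs. The function names, the 0/1 label encoding, the tie-break toward label 1, and the example weights are all assumptions made for illustration. The CVCF noise filter, the SVM base classifiers, and the SAGA weight optimization are omitted.

```python
import random
from typing import List, Sequence


def smote_like_sample(
    x: Sequence[float], neighbor: Sequence[float], rng: random.Random
) -> List[float]:
    """Create one synthetic point on the segment between x and a neighbor.

    Both inputs are assumed to be minority-class samples; choosing the
    neighbor (for example among the k nearest) is left to the caller.
    """
    gap = rng.random()  # random interpolation factor in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


def weighted_majority_vote(
    predictions: Sequence[int], weights: Sequence[float]
) -> int:
    """Combine binary predictions (0 or 1) using one weight per classifier.

    Returns the label with the larger total weight. Ties go to label 1,
    which is an assumption of this sketch, not something the abstract states.
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    score_pos = sum(w for p, w in zip(predictions, weights) if p == 1)
    score_neg = sum(w for p, w in zip(predictions, weights) if p == 0)
    return 1 if score_pos >= score_neg else 0


if __name__ == "__main__":
    rng = random.Random(0)
    synthetic = smote_like_sample([0.0, 0.0], [2.0, 4.0], rng)
    print("synthetic minority sample:", synthetic)

    # Hypothetical weights; in the paper the weight vector is tuned by SAGA.
    label = weighted_majority_vote([1, 0, 0], [0.6, 0.2, 0.1])
    print("ensemble label:", label)
```

With equal weights the vote reduces to plain majority voting; unequal weights let a single reliable base classifier outvote several weaker ones, which is why the choice of weight vector matters for the final ensemble.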

Highlights

The World Health Organization (WHO) reports that cancer is the second leading cause of death and estimates that about 9.6 million people died from cancer worldwide in 2018, mostly in developing countries [1]
Motivated by the above deficiency, this work concentrates on the binary classification problem and proposes a novel ensemble learning paradigm for medical diagnosis with imbalanced data, which consists of three phases
We propose a novel ensemble learning paradigm for medical diagnosis with imbalanced data, which consists of three phases: data pre-processing, training base classifiers and final ensemble

Summary

Introduction

The World Health Organization (WHO) reports that cancer is the second leading cause of death and estimates that about 9.6 million people died from cancer worldwide in 2018, mostly in developing countries [1]. 17.5 million people die each year from cardiovascular diseases (CVDs). In China, it was estimated that 214,360 women had died from breast cancer by 2008 and that the number of deaths would reach up to 2.5 million by 2021 [2]. This serious situation causes suffering for patients and their families [3].



Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 46	License type: CC BY 4.0
