Cancer Classification with a Cost-Sensitive Naive Bayes Stacking Ensemble.

Yueling Xiong, Changrong Wu, Mingquan Ye

doi:10.1155/2021/5556992

Abstract

Ensemble learning combines multiple learners and offers good flexibility and higher generalization performance. To achieve higher-quality cancer classification, this study first used the fast correlation-based feature selection (FCBF) method to preprocess the data and eliminate irrelevant and redundant features. Classification was then carried out with a stacking ensemble learner. A library for support vector machines (LIBSVM), K-nearest neighbor (KNN), the C4.5 decision tree (C4.5), and random forest (RF) were used as the primary learners of the stacking ensemble. Given the imbalanced characteristics of cancer gene expression data, an embedded cost-sensitive naive Bayes classifier was used as the metalearner of the stacking ensemble; the resulting method is denoted CSNB stacking. The proposed CSNB stacking method was applied to nine cancer datasets to verify its classification performance. Compared with other classification methods, including single-classifier algorithms and ensemble algorithms, the experimental results showed the effectiveness and robustness of the proposed method in processing different types of cancer data. This method may therefore help guide cancer diagnosis and research.
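The metalearner step can be illustrated with a minimal sketch. It assumes that the metalearner sees only the class labels predicted by the primary learners, and that cost sensitivity is applied by choosing the class with the lowest expected misclassification cost. This is one common way to make naive Bayes cost-sensitive and is not necessarily the formulation used in the paper. The function names, the toy votes, and the 5:1 cost ratio are all hypothetical.

```python
from collections import Counter


def fit_meta_nb(meta_rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes on primary-learner votes (Laplace-smoothed)."""
    classes = sorted(set(labels))
    totals = Counter(labels)
    # counts[c][j][v]: how often learner j voted v on samples whose true class is c
    counts = {c: [Counter() for _ in meta_rows[0]] for c in classes}
    for row, label in zip(meta_rows, labels):
        for j, vote in enumerate(row):
            counts[label][j][vote] += 1
    return {"classes": classes, "totals": totals, "counts": counts,
            "alpha": alpha, "n": len(labels)}


def posterior(model, row):
    """Return P(class | votes) under the naive independence assumption."""
    classes, alpha = model["classes"], model["alpha"]
    k = len(classes)  # each learner votes for one of k labels
    scores = {}
    for c in classes:
        p = (model["totals"][c] + alpha) / (model["n"] + alpha * k)
        for j, vote in enumerate(row):
            p *= (model["counts"][c][j][vote] + alpha) / (model["totals"][c] + alpha * k)
        scores[c] = p
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}


def min_cost_class(post, cost):
    """Pick the prediction with the lowest expected cost; cost[true][pred]."""
    def expected_cost(pred):
        return sum(post[true] * cost[true][pred] for true in post)
    return min(post, key=expected_cost)


# Toy, made-up votes from two primary learners on an imbalanced training set.
meta_rows = [("normal", "normal"), ("normal", "normal"), ("normal", "tumor"),
             ("normal", "normal"), ("tumor", "tumor"), ("normal", "tumor")]
labels = ["normal", "normal", "normal", "normal", "tumor", "tumor"]

model = fit_meta_nb(meta_rows, labels)
post = posterior(model, ("normal", "tumor"))

# Assumed costs: missing a tumor is five times worse than a false alarm.
cost = {"normal": {"normal": 0, "tumor": 1},
        "tumor": {"normal": 5, "tumor": 0}}

plain_choice = max(post, key=post.get)      # ignores costs
costly_choice = min_cost_class(post, cost)  # weighs errors by cost
```

On this toy data the plain maximum-posterior rule favors the majority class, while the cost-weighted rule switches to the minority class once a missed tumor is penalized more heavily, which is the behavior a cost-sensitive metalearner is meant to provide on imbalanced data.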

Highlights

Cancer is a malignant tumor originating from epithelial tissues
Data were preprocessed first, and fast correlation-based feature selection (FCBF) was used in WEKA for feature selection
The results show that the proposed CSNB stacking method achieves the highest recall rate on all nine datasets, followed by the CSKNN stacking method

Summary

Introduction

Cancer is a malignant tumor originating from epithelial tissues. It is a disease caused by the loss of normal regulation and the excessive proliferation of cells in the body. Because the occurrence and development of cancer are dynamic, most patients are diagnosed in the late stages, which makes clinical diagnosis and treatment challenging [1, 2]. With the continual development of DNA microarray technology, gene expression profile data are gathered by simultaneously tracking the expression of many genes. Many features in gene expression profiles are irrelevant or redundant for classification. Substantial computational challenges, such as high dimensionality, small sample sizes, high noise, and unbalanced categories, make cancer gene data difficult to analyze and process.
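To make the notion of irrelevant and redundant features concrete, the sketch below computes symmetrical uncertainty, the correlation measure on which FCBF is based: SU(X, Y) = 2 [H(X) + H(Y) - H(X, Y)] / [H(X) + H(Y)]. It covers only the measure, not the full FCBF search, and it assumes that expression values have already been discretized. The function names and toy values are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU in [0, 1]: 0 means independent, 1 means each fully determines the other."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both sequences are constant, so there is nothing to share
    joint = entropy(list(zip(x, y)))
    mutual_information = hx + hy - joint
    return 2.0 * mutual_information / (hx + hy)


# Toy discretized features for four samples and their class labels.
labels = ["tumor", "tumor", "normal", "normal"]
gene_a = ["high", "high", "low", "low"]   # tracks the class exactly
gene_b = ["high", "low", "high", "low"]   # unrelated to the class

su_a = symmetrical_uncertainty(gene_a, labels)
su_b = symmetrical_uncertainty(gene_b, labels)
```

A feature such as `gene_b`, whose SU with the class is near zero, is a candidate for removal as irrelevant. FCBF applies the same measure between pairs of features to detect redundancy.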



Journal: Computational and Mathematical Methods in Medicine
Publication Date: Apr 26, 2021
Citations: 25
License type: CC BY 4.0


