A framework model using multifilter feature selection to enhance colon cancer classification.

Murad Al-Rajab, Joan Lu, Qiang Xu

doi:10.1371/journal.pone.0249094

Open Access

https://doi.org/10.1371/journal.pone.0249094

Journal: PLOS ONE
Publication Date: Apr 16, 2021
Citations: 20
License type: CC BY 4.0

Affiliation: University of Huddersfield

Abstract

Gene expression profiles can be used in the diagnosis of critical diseases such as cancer, and the selection of biomarker genes from these profiles is crucial for cancer detection. This paper presents a framework proposing a two-stage multifilter hybrid model of feature selection for colon cancer classification. Colon cancer is now extremely common compared with other types of cancer. There is a need for a fast and accurate method to detect cancerous tissues and to enhance the diagnostic process and drug discovery. The objective of the study reported here is to improve the diagnosis of colon cancer through a two-stage, multifilter model of feature selection. The model first performs feature selection using a combination of Information Gain and a Genetic Algorithm. The next stage filters and ranks the genes identified in this way using the minimum Redundancy Maximum Relevance (mRMR) technique. The final phase further analyzes the data using correlated machine learning algorithms. This two-stage approach, in which genes are selected before classification techniques are applied, improves success rates for the identification of cancer cells. The Decision Tree, K-Nearest Neighbor, and Naïve Bayes classifiers showed promisingly accurate results using the developed hybrid framework model. It is concluded that the proposed method achieves higher accuracy than the existing methods reported in the literature. This study can serve as a guide for enhancing treatment and drug discovery for colon cancer.
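
The filter stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes gene expression values have already been discretized to 0/1, it omits the Genetic Algorithm step entirely, and the toy data and the names `information_gain_filter`, `mrmr_rank`, `genes`, and `labels` are hypothetical. It shows an Information Gain filter followed by a greedy mRMR ranking, using only the Python standard library.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equal-length discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def information_gain_filter(genes, labels, k):
    """Keep the k genes with the highest information gain about the labels.

    For a discrete feature, information gain equals the mutual information
    between the feature and the class label.
    """
    ranked = sorted(genes, key=lambda g: mutual_information(genes[g], labels),
                    reverse=True)
    return ranked[:k]


def mrmr_rank(genes, labels, candidates):
    """Greedy mRMR: repeatedly pick the gene with the best
    (relevance to labels) minus (mean redundancy with genes already picked)."""
    selected, remaining = [], list(candidates)

    def score(g):
        relevance = mutual_information(genes[g], labels)
        if not selected:
            return relevance
        redundancy = sum(mutual_information(genes[g], genes[s])
                         for s in selected) / len(selected)
        return relevance - redundancy

    while remaining:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy, made-up data: 8 samples, 4 binary "genes" (assumption, not from the paper).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
genes = {
    "g1": [1, 1, 1, 0, 0, 0, 0, 0],  # informative
    "g2": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate of g1 (redundant)
    "g3": [0, 1, 1, 1, 0, 0, 0, 1],  # weakly informative, little overlap with g1
    "g4": [1, 0, 1, 0, 1, 0, 1, 0],  # uninformative
}

stage1 = information_gain_filter(genes, labels, k=3)  # drops the uninformative gene
stage2 = mrmr_rank(genes, labels, stage1)             # demotes the redundant duplicate
print(stage1)
print(stage2)
```

In this toy case the Information Gain filter discards `g4`, and mRMR then ranks `g3` ahead of `g2`, because `g2` adds nothing beyond `g1`. The ranked genes would then be passed to a classifier, which this sketch leaves out.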

Highlights

Cancer is reckoned by the World Health Organisation (WHO) to be the second most common cause of death in the world [1].
The least accurate algorithm was Support Vector Machines (SVM) (81.25%), whilst the level of performance achieved by Naïve Bayes (NB) (87.5%) was acceptable; 2) for dataset 2, NB performed best, with a classification accuracy of 100% under the two-stage model.
A confusion matrix records True Positives (TP), the number of correctly identified positive samples; True Negatives (TN), the number of correctly identified negative samples; False Positives (FP), the samples erroneously diagnosed as positive; and False Negatives (FN), the positive samples wrongly diagnosed as negative.
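
As a small illustration of the confusion matrix terms above, the following sketch counts TP, TN, FP, and FN from true and predicted labels and derives classification accuracy as (TP + TN) / (TP + TN + FP + FN). The function names and the toy labels are hypothetical and are not taken from the paper's experiments.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for binary labels; `positive` marks the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # positive sample correctly identified
            else:
                fp += 1  # negative sample wrongly called positive
        else:
            if truth == positive:
                fn += 1  # positive sample wrongly called negative
            else:
                tn += 1  # negative sample correctly identified
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up labels for illustration only: 1 = tumour, 0 = normal (assumption).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)             # 3 3 1 1
print(accuracy(tp, tn, fp, fn))   # 0.75
```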

Summary

Introduction

Cancer is reckoned by the World Health Organisation (WHO) to be the second most common cause of death in the world [1]. The first dataset was collected from Alon et al. [71] and has been used in several colon cancer research studies [18, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 59, 72]. This dataset is publicly available and is still used in recent studies [22, 23, 24, 35, 57, 56, 73, 74, 75, 76, 77].

Similar Papers

A hybrid machine learning feature selection model-HMLFSM to enhance gene classification applied to multiple colon cancers dataset.
Murad Al-Rajab ... Emad Shuweikeh
PloS one | VOL. 18
02 Nov 2023

Energy optimization for wireless sensor network using minimum redundancy maximum relevance feature selection and classification techniques
Muteeah Aljawarneh ... Ahmed Zouinkhi
PeerJ Computer Science | VOL. 10
30 Apr 2024

A new multi-colony fairness algorithm for feature selection
Xiang Feng ... Tan Yang
Soft Computing | VOL. 21
06 Jul 2016

Unsupervised robust Bayesian feature selection
Jianyong Sun ... Aimin Zhou
01 Jul 2014
