The Effect of Clustering in Filter Method Results Applied in Medical Datasets

Nadjla Elong, Sidi Ahmed Rahal

doi:10.4018/ijhisi.2021010103

Abstract

For deeper and richer analysis of medical datasets, feature selection aims to eliminate redundant and irrelevant features from the data. While the filter method has been touted as one of the simplest feature selection methods, its applications have generally failed to identify and handle embedded similarities among features. This research proposes a hybrid feature selection approach that combines the filter method with hierarchical agglomerative clustering to eliminate irrelevant and redundant features in four medical datasets. A formal evaluation of the proposed approach shows major improvements in classification accuracy compared with the results obtained by applying filter methods alone and/or more classical feature selection approaches.

Highlights

In pursuit of deeper and richer analysis of medical datasets, a key challenge in building a superior classification model via machine learning (ML) is identifying a set of representative features that are inherently embedded in cumulative health datasets.
Past research has investigated the application of various feature selection methods that are of growing interest to the medical data analytics research community (Polat & Güneş, 2009; Akay, 2009; Shilaskar & Ghatol, 2013; Lavanya & Rani, 2011; Anbarasi, Anupriya & Iyengar, 2010; Inbarani, Azar & Jothi, 2014; Kumar, Ramachandra & Nagamani, 2014; Ibrahim, Ojo & Oluwafisoye, 2018).
While similarity is a quantity that reflects the strength of the relationship between two data items, dissimilarity measures the divergence between two data items (Irani, Pise & Phatak, 2016). Building on two methods, filter methods and the hierarchical agglomerative clustering (HAC) algorithm, we propose an approach for feature selection.
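The combination of a filter step with HAC over features can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the filter score is the absolute Pearson correlation with the class label, the dissimilarity between two features is 1 minus their absolute correlation, clusters are merged with average linkage, and the function names, thresholds (`min_score`, `max_dist`), and toy data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def filter_scores(columns, target):
    """Filter step (assumed score): |correlation| of each feature with the target."""
    return {name: abs(pearson(col, target)) for name, col in columns.items()}

def hac_features(columns, max_dist):
    """Average-linkage agglomerative clustering of features.

    Assumed dissimilarity: 1 - |Pearson correlation| between two features.
    Merging stops once the closest pair of clusters is farther than max_dist.
    """
    names = list(columns)
    dist = {(a, b): 1.0 - abs(pearson(columns[a], columns[b]))
            for a in names for b in names if a != b}
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > max_dist:
            break
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters

def select_features(columns, target, max_dist=0.3, min_score=0.2):
    """Drop low-scoring features, cluster the rest, keep the best feature per cluster."""
    scores = filter_scores(columns, target)
    relevant = {n: c for n, c in columns.items() if scores[n] >= min_score}
    clusters = hac_features(relevant, max_dist)
    return sorted(max(c, key=scores.get) for c in clusters)

# Hypothetical toy data: f2 nearly duplicates f1, f3 is unrelated to the label.
columns = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 13],
    "f3": [1, -1, 0, 0, -1, 1],
    "f4": [0, 1, 0, 1, 2, 1],
}
target = [0, 0, 0, 1, 1, 1]
selected = select_features(columns, target)
```

On this toy data the filter step discards `f3` as irrelevant, and the clustering step groups `f1` with `f2`, so only one of that redundant pair is kept alongside `f4`. A real study would substitute its own filter score, linkage, and cut-off.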

Summary

Introduction

In pursuit of deeper and richer analysis of medical datasets, a key challenge in building a superior classification model via machine learning (ML) is identifying a set of representative features that are inherently embedded in cumulative health datasets. This representative set should contain mostly relevant and non-redundant features so as to achieve improved accuracy and better classification results for data modeling. More recent filter-based feature selection approaches, such as mRMR (minimum Redundancy Max Relevancy), have been designed for improved feature selection on microarray data. Other prominent feature selection approaches include the Fast Correlation Based Filter (FCBF), FAST, and methods based on Genetic Algorithms (GAs), Genetic Programming (GP), and Particle Swarm Optimization (PSO).
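To make the relevance versus redundancy trade-off behind mRMR concrete, the following is a minimal sketch of a greedy selection loop. It assumes discrete features and uses one common formulation, relevance (mutual information with the label) minus mean redundancy (mutual information with features already chosen). The function names and toy data are hypothetical, and this is not presented as the implementation used in the paper.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def mrmr(features, target, k):
    """Greedily pick k features maximising relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    selected, remaining = [], set(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes ties deterministic: the alphabetically first name wins
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy data: g2 duplicates g1, g3 is weaker but adds new information.
features = {
    "g1": [0, 0, 0, 1, 1, 1, 1, 1],
    "g2": [0, 0, 0, 1, 1, 1, 1, 1],
    "g3": [0, 1, 0, 0, 1, 0, 1, 1],
}
target = [0, 0, 0, 0, 1, 1, 1, 1]
chosen = mrmr(features, target, k=2)
```

Here `g2` is as relevant as `g1` but fully redundant with it, so the second pick goes to `g3` even though its individual relevance is lower. Continuous features, such as microarray expression values, would need discretisation or a different dependence estimate first.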


Journal: International Journal of Healthcare Information Systems and Informatics
Publication Date: Jan 1, 2021
Citations: 3
License type: CC BY 3.0


