Combination of Reduction Detection Using TOPSIS for Gene Expression Data Analysis

Jogeswar Tripathy, Rasmita Dash, Sambit Kumar Mishra, Binod Kumar Pattanayak, Tapas Kumar Mishra, Deepak Puthal

doi:10.3390/bdcc6010024

Abstract

In high-dimensional data analysis, Feature Selection (FS) is one of the most fundamental issues in machine learning and requires the attention of researchers. High-dimensional datasets are characterized by a huge feature space, of which only a few features are significant for analysis. Thus, extracting the significant features is crucial. Various techniques are available for feature selection; among them, filter techniques are significant in this community, as they can be used with any type of learning algorithm, drastically lower the running time of optimization algorithms, and improve the performance of the model. However, the effectiveness of a filter approach depends on the characteristics of the dataset as well as on the machine learning model. To address these issues, this research considers a combination of feature reduction (CFR), designing a pipeline of filter approaches for high-dimensional microarray data classification. Considering four filter approaches, sixteen combinations of pipelines are generated. The feature subset is reduced at different levels, and ultimately, the significant feature set is evaluated. The pipelined filter techniques are Correlation-Based Feature Selection (CBFS), Chi-Square Test (CST), Information Gain (InG), and Relief Feature Selection (RFS), and the classification techniques are Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), and k-Nearest Neighbor (k-NN). The performance of CFR depends highly on the datasets as well as on the classifiers. Thereafter, the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method is used to rank all reduction combinations and identify the superior filter combination among them.
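The TOPSIS ranking step mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard TOPSIS procedure (vector normalisation, weighting, distance to the ideal and anti-ideal solutions), not the authors' implementation. The function name `topsis`, the pipeline labels, the two criteria, the equal weights, and all numbers are hypothetical and chosen only to show the mechanics.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) over criteria (columns) with TOPSIS.

    matrix  : list of rows, one row of criterion values per alternative
    weights : one weight per criterion
    benefit : per criterion, True if larger is better, False if smaller is better
    Returns one closeness score per alternative in [0, 1]; higher is better.
    Assumes no criterion column is all zeros and the alternatives are not all identical.
    """
    n_crit = len(weights)
    # Step 1: vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Step 2: build the ideal and anti-ideal solutions column by column.
    columns = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(columns, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(columns, benefit)]
    # Step 3: relative closeness = distance to anti-ideal / sum of both distances.
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical example: three filter pipelines judged on
# classification accuracy (benefit) and number of retained genes (cost).
names = ["CBFS then CST", "InG then RFS", "RFS then CBFS"]
table = [[0.90, 50], [0.85, 20], [0.80, 80]]
scores = topsis(table, weights=[0.5, 0.5], benefit=[True, False])
for name, score in sorted(zip(names, scores), key=lambda p: p[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

In practice the decision matrix would hold one row per reduction combination and one column per evaluation measure; which measures and weights to use is a modelling choice that this sketch leaves open.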

Highlights

Over the years, researchers have been using microarray technology to track gene expression on a genomic scale.
Inspired by the above analysis, which is discussed by several researchers, this paper proposes a pipeline of reduction combinations using filter approaches.
K-NN [28,29] chooses the class value of a new instance by examining the k closest instances in the training set, as shown in Equation (6), and selecting the most frequent class value among them, with k set to five and Euclidean distance used to measure the similarity between two points. It stores the training data and classifies query data based on a similarity measure. k-NN parameter tuning is performed to improve performance by selecting an appropriate value of k.
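The k-NN rule described in the last highlight can be sketched in a few lines. This is an illustrative sketch only, assuming Euclidean distance and k = 5 as stated above; the function name `knn_predict`, the two-gene toy profiles, and the class labels are hypothetical and do not come from the paper's datasets.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=5):
    """Return the most frequent class among the k training points nearest to query."""
    # Sort training samples by Euclidean distance to the query and keep the k closest.
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    # Majority vote over the labels of those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical expression levels of two genes for seven samples.
train_X = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
           (5.0, 5.1), (4.8, 5.3), (5.2, 4.9), (5.1, 5.0)]
train_y = ["normal", "normal", "normal",
           "tumor", "tumor", "tumor", "tumor"]

print(knn_predict(train_X, train_y, query=(5.0, 5.0)))
print(knn_predict(train_X, train_y, query=(1.0, 1.0)))
```

Tuning k, as the highlight mentions, would amount to repeating this prediction for several values of k on held-out data and keeping the value that classifies best.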

Summary

Introduction

Researchers have been using microarray technology to track gene expression on a genomic scale. Cancer diagnosis and classification are possible through examining the expression of genes. The use of microarray technology to analyze gene expression has opened up a world of possibilities for studying cell and organism biology [1]. Researchers focus primarily on the behavior of genes across the experimental conditions studied; recently, biomedical applications have fueled both the use of available technologies and the efficient implementation of new analytical tools to deal with these complex data. Microarray data analysis yields useful results that aid in the resolution of gene expression problems. Cancer categorization is one of the most significant uses of microarray data analysis. Such categorization reflects variations in the expression levels of various genes.



Journal: Big Data and Cognitive Computing
Publication Date: Feb 23, 2022
Citations: 16
License type: CC BY 4.0
