A Parallel Multilevel Feature Selection algorithm for improved cancer classification

Lokeswari Venkataramana, Shomona Gracia Jacob, Rajavel Ramadoss

doi:10.1016/j.jpdc.2019.12.015

Abstract

Biological data tends to grow exponentially, and processing it consumes ever more resources, time and manpower. Parallelizing algorithms can reduce overall execution time, but there are two main challenges in parallelizing computational methods: (1) biological data is multi-dimensional in nature, and (2) parallel algorithms reduce execution time at the cost of reduced prediction accuracy. This paper targets both issues and proposes the following approaches: (1) vertical partitioning of data along the feature space and horizontal partitioning along samples, to ease data parallelism; and (2) a Parallel Multilevel Feature Selection (M-FS) algorithm that selects optimal and important features for improved classification of cancer sub-types. The selected features are evaluated using parallel Random Forest on Spark, and the results are compared both with previously reported results and with sequential execution of the same algorithms. The proposed parallel M-FS algorithm was also compared with existing parallel feature selection algorithms in terms of accuracy and execution time. The results show that the parallel multilevel feature selection algorithm improved cancer classification, yielding prediction accuracy ranging from ∼85% to ∼99% with execution times on the order of seconds. In contrast, existing sequential algorithms yielded prediction accuracy of ∼65% to ∼99% with execution times of more than 24 hours.
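
The two partitioning directions named in the abstract can be illustrated with a minimal sketch. This is not the authors' Spark implementation; the function names (`horizontal_partition`, `vertical_partition`, `chunk_ranges`), the contiguous near-equal split, and the toy matrix are all assumptions made for illustration. Rows stand for samples and columns for features.

```python
from typing import List

Matrix = List[List[float]]  # rows = samples, columns = features


def chunk_ranges(n: int, parts: int) -> List[range]:
    """Split range(n) into `parts` contiguous, near-equal index ranges."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        ranges.append(range(start, start + size))
        start += size
    return ranges


def horizontal_partition(data: Matrix, parts: int) -> List[Matrix]:
    """Partition along samples: each block holds some rows, all features."""
    return [[data[r] for r in rows] for rows in chunk_ranges(len(data), parts)]


def vertical_partition(data: Matrix, parts: int) -> List[Matrix]:
    """Partition along features: each block holds all rows, some columns."""
    n_features = len(data[0]) if data else 0
    return [
        [[row[c] for c in cols] for row in data]
        for cols in chunk_ranges(n_features, parts)
    ]


# Hypothetical toy data: 4 samples x 6 features, value = 10 * row + column.
toy = [[10 * r + c for c in range(6)] for r in range(4)]
row_blocks = horizontal_partition(toy, 2)  # 2 blocks of 2 samples x 6 features
col_blocks = vertical_partition(toy, 3)    # 3 blocks of 4 samples x 2 features
```

In a supervised setting, each vertical block would presumably also need a copy of the class labels so that features can be scored independently per block; the sketch omits labels to keep the two split directions easy to see.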

Journal: Journal of Parallel and Distributed Computing
Publication Date: Dec 28, 2019
Citations: 14

Similar Papers

Conference photo
Computer Physics Communications | VOL. 179
15 Feb 2008

Parallel dual-channel multi-label feature selection
Jiali Miao ... Yusheng Cheng
Soft Computing | VOL. 27
25 Feb 2023

Improving classification accuracy of cancer types using parallel hybrid feature selection on microarray gene expression data.
Lokeswari Venkataramana ... Dommaraju Haritha
Genes & genomics | VOL. 41
19 Aug 2019

Zoo: Selecting Transcriptomic and Methylomic Biomarkers by Ensembling Animal-Inspired Swarm Intelligence Feature Selection Algorithms.
Yuanyuan Han ... Fengfeng Zhou
Genes | VOL. 12
18 Nov 2021
