Abstract

Feature screening is an important and challenging topic in class-imbalance learning. Most existing feature screening algorithms for class-imbalanced data are based on filtering techniques. However, the variable rankings produced by different filtering techniques generally disagree, and this inconsistency among ranking methods is usually ignored in practice. To address this problem, we propose a simple strategy called rank aggregation with re-balance (RAR) for finding key variables in class-imbalanced data. RAR fuses the individual rankings into a synthetic ranking that takes every method into account. The class-imbalanced data are first modified via re-sampling procedures, and RAR is then performed on the balanced data. Five class-imbalanced real datasets and their re-balanced versions are used to evaluate RAR against several popular feature screening methods. The results show that RAR is highly competitive, outperforming single-filter screening on almost all assessment metrics, and that re-balancing pretreatment substantially improves rank aggregation when the data are class-imbalanced.
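The fusion step described above can be sketched as a simple mean-rank (Borda-style) aggregation; the exact aggregation rule used by RAR may differ, and the function name and toy rankings below are illustrative only.

```python
import numpy as np

def aggregate_ranks(rank_lists):
    """Fuse several variable rankings into one synthetic ranking
    via mean rank (Borda-style aggregation). rank_lists[k][j] is
    feature j's rank under filter method k (1 = most important)."""
    ranks = np.asarray(rank_lists, dtype=float)
    mean_rank = ranks.mean(axis=0)   # average position of each feature
    # features ordered by ascending mean rank: best first
    return np.argsort(mean_rank)

# three hypothetical filter rankings over 4 features
r1 = [1, 2, 3, 4]
r2 = [2, 1, 4, 3]
r3 = [1, 3, 2, 4]
order = aggregate_ranks([r1, r2, r3])
print(order)  # feature indices from most to least important
```

A feature ranked highly by every filter gets a low mean rank and rises to the top, so the synthetic ranking takes every individual ranking into account rather than trusting any single filter.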

Highlights

  • In binary classification, a dataset is called “imbalanced” if one class far outnumbers the other in the training data

  • Datasets with imbalanced class distributions are quite common in classification

  • A natural way to combat this challenge is to combine the information from every filtering approach while relieving the effect of class imbalance; this motivates our strategy of rank aggregation with re-balance


Summary

Introduction

In binary classification, a dataset is called “imbalanced” if the number of samples in one class is far larger than in the other. The majority class is called negative, while the minority class is called positive. A major hindrance in class-imbalance learning is that standard classifiers are often biased towards the majority class. Re-sampling is the standard strategy for class-imbalance learning tasks, and many studies [2,3,4] have shown that re-sampling the dataset effectively improves classification performance for several types of classifiers. Re-sampling methods modify the training set to make it suitable for a standard classifier. There are generally three types of re-sampling strategies for balancing the class distribution: over-sampling, under-sampling, and hybrid sampling.
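The two basic schemes can be sketched with simple random re-sampling (hybrid sampling combines both); the function name and toy data below are illustrative, not from the paper, and SMOTE-style synthetic over-sampling is a common alternative to plain duplication.

```python
import random

def rebalance(X, y, method="over", seed=0):
    """Randomly re-sample a binary dataset (labels 0/1) to a 1:1 ratio.
    'over' duplicates minority samples with replacement;
    'under' drops majority samples without replacement."""
    rng = random.Random(seed)
    pos = [i for i, c in enumerate(y) if c == 1]
    neg = [i for i, c in enumerate(y) if c == 0]
    if len(pos) > len(neg):          # ensure pos holds the minority indices
        pos, neg = neg, pos
    if method == "over":
        # grow the minority class up to the majority size
        pos = pos + [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    else:
        # shrink the majority class down to the minority size
        neg = rng.sample(neg, len(pos))
    idx = pos + neg
    return [X[i] for i in idx], [y[i] for i in idx]

X = [[v] for v in range(10)]
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # 2 positives, 8 negatives
Xb, yb = rebalance(X, y, "over")
print(sum(yb), len(yb))  # 8 positives out of 16 samples
```

Over-sampling keeps all the original information at the cost of duplicated minority points (risking overfitting), while under-sampling discards majority information but yields a smaller, faster training set; hybrid sampling trades between the two.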


