Multi-TGDR: A Regularization Method for Multi-Class Classification in Microarray Experiments

Suyan Tian, Mayte Suárez-Fariñas

doi:10.1371/journal.pone.0078302

Abstract

Background: As microarray technology has matured and become popular, selecting a small number of relevant genes for accurate sample classification has become a prominent topic in biostatistics and bioinformatics. However, most existing algorithms cannot handle multiple classes, which is arguably a common application. Here, we propose an extension of an existing regularization algorithm, Threshold Gradient Descent Regularization (TGDR), that specifically addresses multi-class classification of microarray data. When several microarray experiments address the same or similar objectives, one option is a meta-analysis version of TGDR (Meta-TGDR), which treats the classification task as a combination of classifiers that share the same structure/model while allowing the parameters to vary across studies. However, the original Meta-TGDR did not offer a way to make predictions on independent samples. Here, we propose an explicit method to estimate the overall coefficients of the biomarkers selected by Meta-TGDR. This extension permits broader applicability and allows the predictive performance of Meta-TGDR and TGDR to be compared on an independent testing set.

Results: Using real-world applications, we demonstrate that the proposed multi-TGDR framework works well and that it selects fewer genes than the sum of all individual binary TGDRs. Additionally, Meta-TGDR and TGDR applied to the batch-effect-adjusted pooled data gave approximately the same results. Adding a Bagging procedure in each application ensured stability and good predictive performance.

Conclusions: Compared with Meta-TGDR, TGDR requires less computing time and does not require samples of all classes in each study. On the adjusted data, its predictive performance is approximately the same as that of Meta-TGDR. Thus, it is highly recommended.
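The general idea behind threshold gradient descent can be illustrated with a small sketch. This is not the authors' multi-TGDR implementation. It assumes a plain softmax (multinomial logistic) log-likelihood without intercepts, and it applies the usual thresholding rule: at each step, only coefficients whose gradient magnitude is at least `tau` times the largest gradient magnitude are updated, so coefficients of uninformative features stay at exactly zero. The function names, the toy data, and the values of `tau`, `step`, and `n_iter` are all illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(beta, x):
    """Linear score of sample x for every class (no intercept, by assumption)."""
    return [sum(b * v for b, v in zip(row, x)) for row in beta]

def multi_tgdr_sketch(X, y, n_classes, tau=0.9, step=0.05, n_iter=200):
    """Thresholded gradient ascent on a softmax log-likelihood (illustrative only)."""
    p = len(X[0])
    beta = [[0.0] * p for _ in range(n_classes)]  # all coefficients start at zero
    for _ in range(n_iter):
        # Gradient of the log-likelihood with respect to beta[k][j].
        grad = [[0.0] * p for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            probs = softmax(class_scores(beta, xi))
            for k in range(n_classes):
                resid = (1.0 if yi == k else 0.0) - probs[k]
                for j in range(p):
                    grad[k][j] += resid * xi[j]
        gmax = max(abs(g) for row in grad for g in row)
        if gmax == 0.0:
            break
        # Threshold step: update only coefficients with a large enough gradient.
        for k in range(n_classes):
            for j in range(p):
                if abs(grad[k][j]) >= tau * gmax:
                    beta[k][j] += step * grad[k][j]
    return beta

def selected_features(beta):
    """Indices of features with a nonzero coefficient in at least one class."""
    return sorted({j for row in beta for j, b in enumerate(row) if b != 0.0})

def predict(beta, x):
    """Class with the highest linear score."""
    scores = class_scores(beta, x)
    return scores.index(max(scores))

# Toy data (made up): features 0 and 1 separate three classes, features 2 and 3 are noise.
X = [
    [2.0, 0.0, 0.1, -0.1],
    [2.2, 0.1, -0.1, 0.1],
    [0.0, 2.0, 0.1, 0.1],
    [0.1, 2.1, -0.1, -0.1],
    [-2.0, -2.0, 0.1, -0.1],
    [-2.1, -1.9, -0.1, 0.1],
]
y = [0, 0, 1, 1, 2, 2]

beta = multi_tgdr_sketch(X, y, n_classes=3)
print("selected features:", selected_features(beta))
print("training predictions:", [predict(beta, x) for x in X])
```

A larger `tau` makes the selection greedier and the resulting signature sparser, while `tau = 0` reduces to ordinary gradient ascent on all coefficients. In practice `tau` and the number of iterations would be chosen by cross-validation rather than fixed as they are here.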

Highlights

Biomarker discovery from high-dimensional data is a crucial problem with enormous applications in areas of biomedical research and translational medicine.
The proposed algorithm, the Meta Threshold Gradient Descent Regularization (Meta-TGDR), assumes that the same set of genes is selected in all studies, while allowing the β coefficients to vary across studies, in a meta-analysis fashion.
As criticized by Wang et al. [22], lack of parsimony is an obvious disadvantage of TGDR algorithms, a shortcoming inherited by multi-TGDR.

Summary

Introduction

Biomarker discovery from high-dimensional data is a crucial problem with enormous applications in areas of biomedical research and translational medicine. Selecting a small number of relevant features (e.g., genes in transcriptomics profiles, SNPs in GWAS, and metabolites in metabolomics) to build a predictive model that can accurately classify samples by their diagnosis (e.g., diseased or healthy, different stages of one specific cancer) and prognosis (e.g., potential response to a given treatment, 5-year survival with a certain treatment) is an essential step towards personalized medicine. In bioinformatics, such a task is accomplished by a feature selection algorithm, which, besides reducing over-fitting and improving classification accuracy, leads to small molecular signatures that are manageable to verify experimentally and that enable the potential design of cheap, dedicated diagnostic and prognostic tools.

Journal: PLoS ONE	Publication Date: Nov 19, 2013
Citations: 39	License type: CC BY 4.0
