Robust gene selection methods using weighting schemes for microarray data analysis

Suyeon Kang, Jongwoo Song

doi:10.1186/s12859-017-1810-x

Open Access

https://doi.org/10.1186/s12859-017-1810-x

Journal: BMC Bioinformatics	Publication Date: Sep 2, 2017
Citations: 15	License type: open-access

Affiliation: Ewha Womans University

Abstract

Background: A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates.

Results: We have proposed new filter-based gene selection techniques, by applying a simple modification to significance analysis of microarrays (SAM). To prove the effectiveness of the proposed method, we considered a series of synthetic datasets with different noise levels and sample sizes along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods for all simulation set-ups. In particular, our methods are much better when the given data are noisy and sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level became high or sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays.

Conclusions: The results of simulation study and real data analysis have demonstrated that our proposed methods are effective for detecting significant genes and classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis.
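For readers unfamiliar with SAM, the following is a minimal sketch of the standard SAM relative-difference statistic for two classes, which the abstract names as the baseline being modified. It is not the authors' weighted method, whose details are not given on this page. The function name `sam_statistic` and the choice to pass the fudge factor `s0` as a plain argument are assumptions made for illustration; SAM itself selects `s0` from the data.

```python
import numpy as np

def sam_statistic(x1, x2, s0=0.0):
    """Standard two-class SAM relative difference d_i for each gene.

    x1, x2: arrays of shape (n_genes, n1) and (n_genes, n2).
    s0: fudge factor added to the denominator (assumed given here;
        SAM normally estimates it from the data).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.shape[1], x2.shape[1]

    # Per-gene difference of class means.
    mean_diff = x1.mean(axis=1) - x2.mean(axis=1)

    # Pooled sum of squared deviations around each class mean.
    ss1 = ((x1 - x1.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    ss2 = ((x2 - x2.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

    # Gene-specific scatter s_i, as in a pooled-variance t-statistic.
    s = np.sqrt((1.0 / n1 + 1.0 / n2) * (ss1 + ss2) / (n1 + n2 - 2))

    return mean_diff / (s + s0)

# Usage on a tiny hypothetical dataset: 2 genes, 3 samples per class.
d = sam_statistic([[1, 2, 3], [5, 5, 6]], [[4, 5, 6], [5, 6, 5]], s0=0.1)
```

A positive `s0` keeps genes with very small scatter from receiving inflated scores, which matters most when the number of replicates is small.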

Highlights

A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states
We consider 7 different combinations of n1 and n2 to account for the effects of sample size and class imbalance on gene selection performance: (n1, n2) = (5, 5), (5, 10), (10, 5), (10, 10), (10, 15), (15, 10) and (15, 15)
This example illustrates the structure of noisy data containing outliers
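The seven (n1, n2) combinations listed in the highlights can be laid out as a simulation grid. The sketch below is only an illustration under stated assumptions: the data-generating model (independent normal noise with a fixed mean shift for the first `n_de` genes) and every parameter name and default in `simulate_dataset` are hypothetical, since this page does not describe the paper's actual simulation design.

```python
import numpy as np

# Sample-size combinations (n1, n2) quoted in the highlights.
SAMPLE_SIZES = [(5, 5), (5, 10), (10, 5), (10, 10), (10, 15), (15, 10), (15, 15)]

def simulate_dataset(n1, n2, n_genes=1000, n_de=50, effect=2.0,
                     noise_sd=1.0, rng=None):
    """Generate a toy two-class expression matrix (hypothetical model).

    The first n_de genes are shifted by `effect` in class 2; all other
    genes have the same distribution in both classes.
    """
    rng = np.random.default_rng(rng)
    x1 = rng.normal(0.0, noise_sd, size=(n_genes, n1))
    x2 = rng.normal(0.0, noise_sd, size=(n_genes, n2))
    x2[:n_de] += effect

    # Ground-truth labels for evaluating a gene selection method.
    is_de = np.zeros(n_genes, dtype=bool)
    is_de[:n_de] = True
    return x1, x2, is_de

# One synthetic dataset per sample-size setting.
datasets = {(n1, n2): simulate_dataset(n1, n2, rng=0) for n1, n2 in SAMPLE_SIZES}
```

Raising `noise_sd` or choosing one of the smaller (n1, n2) settings gives the kind of noisy, low-replicate condition the abstract identifies as hardest for conventional methods.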

Summary

Introduction

Microarray technologies allow us to measure the expression levels of thousands of genes simultaneously. Analysis of such high-throughput data is not new, but it is still useful for statistical testing, which is a crucial part of transcriptomic research. A common task in microarray data analysis is to detect genes that are differentially expressed between experimental conditions or biological phenotypes. Filter methods have been dominant over the past decades owing to their strong advantages, and they are the earliest in the literature [11, 12, 13, 15, 16]. They are preferred by biology and molecular domain experts because the results generated by feature ranking techniques are intuitive and easy to understand. We focus on the filter method in this study.

Source: Kang and Song, BMC Bioinformatics (2017) 18:389

Similar Papers

A comprehensive evaluation of SAM, the SAM R-package and a simple modification to improve its performance.
Shunpu Zhang
BMC Bioinformatics | VOL. 8
29 Jun 2007

Differential expression of microRNAs in mouse liver under aberrant energy metabolic status
Shengjie Li ... Chen-Yu Zhang
Journal of Lipid Research | VOL. 50
01 Sep 2009

Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data.
Ian B Jeffery ... Aedín C Culhane
BMC Bioinformatics | VOL. 7
26 Jul 2006

A Two-Way Semilinear Model for Normalization and Analysis of cDNA Microarray Data
Jian Huang ... Cun-Hui Zhang
Journal of the American Statistical Association | VOL. 100
01 Sep 2005
