A new regularized least squares support vector regression for gene selection

Pei-Chun Chen, Su-Yun Huang, Wei J Chen, Chuhsing K Hsiao

doi:10.1186/1471-2105-10-44

Abstract

Background: Selection of influential genes with microarray data often faces the difficulties of a large number of genes and a relatively small group of subjects. In addition to the curse of dimensionality, many gene selection methods weight the contribution from each individual subject equally. This equal-contribution assumption cannot account for the possible dependence among subjects who are similarly associated with the disease, and may restrict the selection of influential genes.

Results: A novel approach to gene selection is proposed based on kernel similarities and kernel weights. We do not assume uniformity for subject contribution. Weights are calculated via regularized least squares support vector regression (RLS-SVR) of class levels on kernel similarities and are used to weight subject contribution. The cumulative sums of weighted expression levels are then ranked to select responsible genes. These procedures also work for multiclass classification. We demonstrate this algorithm on acute leukemia, colon cancer, small, round blue cell tumors of childhood, breast cancer, and lung cancer studies, using kernel Fisher discriminant analysis and support vector machines as classifiers. Other procedures are compared as well.

Conclusion: This approach is easy to implement and fast in computation for both binary and multiclass problems. The gene set provided by the RLS-SVR weight-based approach contains fewer genes and achieves a higher accuracy than other procedures.
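The weighting-and-ranking idea described in the abstract can be sketched roughly as follows for the binary case. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the bias-free system (K + lam*I) alpha = y, the absolute value in the gene score, and the names `gaussian_kernel`, `rls_weights`, `rank_genes`, `gamma`, and `lam` are all hypothetical choices made here for illustration.

```python
import numpy as np


def gaussian_kernel(X, gamma=0.5):
    """Subject-by-subject Gaussian kernel similarities (assumed kernel choice)."""
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)


def rls_weights(K, y, lam=1.0):
    """Regularized least squares fit of class labels on kernel similarities.

    Assumption: a bias-free kernel ridge system (K + lam * I) alpha = y.
    The paper's exact RLS-SVR formulation may differ.
    """
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), y)


def rank_genes(X, alpha):
    """Score each gene by the weighted sum of its expression over subjects.

    Assumption: genes are ranked by the absolute value of that sum.
    Returns gene indices from highest to lowest score, plus the scores.
    """
    scores = np.abs(alpha @ X)
    return np.argsort(-scores), scores


if __name__ == "__main__":
    # Toy data: 4 subjects (rows) x 3 genes (columns); only gene 0 tracks the class.
    X = np.array([
        [2.0, 0.1, 0.0],
        [2.0, -0.1, 0.0],
        [-2.0, 0.1, 0.0],
        [-2.0, -0.1, 0.0],
    ])
    y = np.array([1.0, 1.0, -1.0, -1.0])

    alpha = rls_weights(gaussian_kernel(X), y)
    ranking, scores = rank_genes(X, alpha)
    print("top-ranked gene:", ranking[0])
    print("scores:", scores)
```

In this sketch each subject receives its own weight `alpha[i]` instead of an equal contribution, and a gene scores highly when its expression pattern across subjects lines up with those weights. A multiclass version would need one weight vector per class coding, which is not shown here.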

Highlights

Selection of influential genes with microarray data often faces the difficulties of a large number of genes and a relatively small group of subjects.
We introduce the proposed gene selection algorithm, briefly discuss the regularized least squares support vector regression (RLS-SVR), and outline classification rules based on the selected genes.
In the following we introduce the principle of the proposed gene selection procedures, and illustrate the regularized least squares (RLS)-SVR algorithm for assigning weights and the support vector machine (SVM) classification.

Summary

Introduction

Selection of influential genes with microarray data often faces the difficulties of a large number of genes and a relatively small group of subjects. Golub et al. [1] and Brown et al. [2] considered the classification of known disease status (called class prediction or supervised learning) using microarray data. These gene expression values are recorded from a large number of genes, of which only a small subset is associated with the disease class labels. In the machine learning community, many procedures, termed gene selection, variable selection, or feature selection, have been developed to identify or select a subset of genes with distinctive features. Both the proportion of "relevant" genes and the number of tissues (subjects) are usually small compared to the number of genes, which leads to difficulties in finding a stable solution. Dimension reduction, both for gene selection and for finding influential genes, is therefore essential.



Journal: BMC Bioinformatics
Publication Date: Feb 3, 2009
Citations: 52
License type: cc-by

