A scale space approach for unsupervised feature selection in mass spectra classification for ovarian cancer detection

Michele Ceccarelli, Angelo Facchiano, Antonio D'Acierno

doi:10.1186/1471-2105-10-s12-s9

Abstract

Background: Mass spectrometry spectra, widely used in proteomics studies as a screening tool for protein profiling and to detect discriminatory signals, are high-dimensional data. A large number of local maxima (a.k.a. peaks) have to be analyzed as part of computational pipelines aimed at building efficient predictive and screening protocols. With data of this dimensionality and sample size, the risk of over-fitting and selection bias is pervasive. Therefore, the development of bioinformatics methods based on unsupervised feature extraction can lead to general tools applicable to several fields of predictive proteomics.

Results: We propose a method for feature selection and extraction grounded in the theory of multi-scale spaces for high-resolution spectra derived from the analysis of serum. We then use support vector machines for classification. In particular, we use a database containing 216 sample spectra divided into 115 cancer and 91 control samples. The overall accuracy averaged over a large cross-validation study is 98.18%. The area under the ROC curve of the best selected model is 0.9962.

Conclusion: We improved on previously known results for the same problem and data, with the advantage that the proposed method has an unsupervised feature selection phase. All the developed code, as MATLAB scripts, can be downloaded from
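The idea of selecting peaks through a multi-scale (scale-space) view of a spectrum can be illustrated with a small sketch. This is not the authors' MATLAB code: the function names, the synthetic spectrum, the chosen scale `sigma=4.0`, and the `min_height` threshold are all assumptions made for illustration. The sketch smooths a 1-D signal with a Gaussian kernel and keeps only the local maxima that survive at the coarse scale, without looking at any class label.

```python
import math

def gaussian_kernel(sigma):
    """Discrete Gaussian truncated at 4 sigma and normalized to sum to 1."""
    radius = max(1, int(math.ceil(4 * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Gaussian smoothing; indices past the ends are clamped to the edge samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def local_maxima(signal):
    """Interior indices higher than the left neighbor and not lower than the right."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] >= signal[i + 1]]

def persistent_peaks(signal, sigma, min_height):
    """Maxima of the smoothed signal whose smoothed height reaches min_height.

    The height threshold is an illustrative assumption, used here to discard
    negligible maxima in nearly flat regions.
    """
    smoothed = smooth(signal, sigma)
    return [i for i in local_maxima(smoothed) if smoothed[i] >= min_height]

def synthetic_spectrum(n=200):
    """Two broad peaks (centered at 60 and 140) plus a fast ripple standing in for noise."""
    return [math.exp(-0.5 * ((i - 60) / 5.0) ** 2)
            + 0.7 * math.exp(-0.5 * ((i - 140) / 5.0) ** 2)
            + 0.05 * math.sin(2.0 * i)
            for i in range(n)]

spectrum = synthetic_spectrum()
raw_maxima = local_maxima(spectrum)          # many maxima, mostly ripple
coarse_peaks = persistent_peaks(spectrum, sigma=4.0, min_height=0.1)
print(len(raw_maxima), coarse_peaks)
```

In a pipeline of the kind the abstract describes, the positions retained at coarse scales would serve as the features passed to a classifier; because the selection never uses the labels, it can be run once on all spectra without leaking class information into cross-validation.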

Highlights

Mass spectrometry spectra, widely used in proteomics studies as a screening tool for protein profiling and to detect discriminatory signals, are high dimensional data
The proposed feature extraction and classification method has been tested on a dataset available from the …
Figure 1 shows the overall process used to test our solution
The paper presented the results obtained by applying a feature extraction procedure for mass spectra classification based on a scale-space analysis of the data

Summary

Introduction

Mass spectrometry spectra, widely used in proteomics studies as a screening tool for protein profiling and to detect discriminatory signals, are high-dimensional data. A large number of local maxima (a.k.a. peaks) have to be analyzed as part of computational pipelines aimed at building efficient predictive and screening protocols. With data of this dimensionality and sample size, the risk of over-fitting and selection bias is pervasive. As a more general case, a whole protein could be expressed or not under pathological conditions. In all these cases, the proteomic pattern analyzed by mass spectrometry techniques can reveal differences due to the pathology. Comparative proteomics can also be exploited to evaluate the effects of a specific therapy.
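The selection bias mentioned above can be made concrete with a small experiment on pure noise. The sketch below is an illustration under stated assumptions, not anything taken from the paper: the data are random numbers with arbitrary labels, a nearest-centroid classifier stands in for the support vector machine, and all names (`top_features`, `loo_accuracy`, and so on) are hypothetical. It compares leave-one-out accuracy when a supervised feature filter is run once on all samples against the same filter rerun inside every fold.

```python
import random

def centroid(rows, features):
    """Mean of the selected feature columns over the given rows."""
    return [sum(r[f] for r in rows) / len(rows) for f in features]

def top_features(X, y, k):
    """Supervised filter: rank features by absolute difference of class means."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    n_features = len(X[0])
    mean_pos = centroid(pos, range(n_features))
    mean_neg = centroid(neg, range(n_features))
    scores = [abs(a - b) for a, b in zip(mean_pos, mean_neg)]
    return sorted(range(n_features), key=lambda f: scores[f], reverse=True)[:k]

def predict(train_X, train_y, x, features):
    """Nearest-centroid classifier restricted to the chosen features."""
    pos = [r for r, label in zip(train_X, train_y) if label == 1]
    neg = [r for r, label in zip(train_X, train_y) if label == 0]
    c_pos, c_neg = centroid(pos, features), centroid(neg, features)
    d_pos = sum((x[f] - c) ** 2 for f, c in zip(features, c_pos))
    d_neg = sum((x[f] - c) ** 2 for f, c in zip(features, c_neg))
    return 1 if d_pos < d_neg else 0

def loo_accuracy(X, y, k, select_inside_fold):
    """Leave-one-out accuracy; features come from all data unless select_inside_fold."""
    global_features = top_features(X, y, k)
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if select_inside_fold:
            features = top_features(train_X, train_y, k)
        else:
            features = global_features
        correct += predict(train_X, train_y, X[i], features) == y[i]
    return correct / len(X)

rng = random.Random(0)
n_samples, n_features = 40, 1000
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [i % 2 for i in range(n_samples)]  # labels carry no real signal

biased = loo_accuracy(X, y, k=10, select_inside_fold=False)
honest = loo_accuracy(X, y, k=10, select_inside_fold=True)
print(f"selection on all data: {biased:.2f}, selection inside folds: {honest:.2f}")
```

Since the labels are unrelated to the data, any accuracy well above chance in the first setting comes only from the held-out sample having influenced the feature choice. An unsupervised selection step avoids this leak by construction, because it never consults the labels.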

Methods

Results

Conclusion

Journal: BMC Bioinformatics	Publication Date: Oct 1, 2009
Citations: 34	License type: cc-by
