Abstract

BackgroundAs a novel cancer diagnostic paradigm, mass spectroscopic serum proteomic pattern diagnostics was reported superior to the conventional serologic cancer biomarkers. However, its clinical use is not fully validated yet. An important factor to prevent this young technology to become a mainstream cancer diagnostic paradigm is that robustly identifying cancer molecular patterns from high-dimensional protein expression data is still a challenge in machine learning and oncology research. As a well-established dimension reduction technique, PCA is widely integrated in pattern recognition analysis to discover cancer molecular patterns. However, its global feature selection mechanism prevents it from capturing local features. This may lead to difficulty in achieving high-performance proteomic pattern discovery, because only features interpreting global data behavior are used to train a learning machine.MethodsIn this study, we develop a nonnegative principal component analysis algorithm and present a nonnegative principal component analysis based support vector machine algorithm with sparse coding to conduct a high-performance proteomic pattern classification. Moreover, we also propose a nonnegative principal component analysis based filter-wrapper biomarker capturing algorithm for mass spectral serum profiles.ResultsWe demonstrate the superiority of the proposed algorithm by comparison with six peer algorithms on four benchmark datasets. Moreover, we illustrate that nonnegative principal component analysis can be effectively used to capture meaningful biomarkers.ConclusionOur analysis suggests that nonnegative principal component analysis effectively conduct local feature selection for mass spectral profiles and contribute to improving sensitivities and specificities in the following classification, and meaningful biomarker discovery.

Highlights

  • As a novel cancer diagnostic paradigm, mass spectroscopic serum proteomic pattern diagnostics was reported superior to the conventional serologic cancer biomarkers

  • We present a nonnegative principal component analysis (NPCA) algorithm and propose a nonnegative principal component analysis based support vector machine algorithm (NPCA-SVM) for high-performance proteomic pattern discovery

  • Discussion nonnegative principal component analysis has overcome the global nature of the standard PCA algorithm, and contributed to the high-performance proteomic pattern prediction and effective biomarker capture, it is an expensive algorithm with a high complexity O(d2n × N) compared to the classic PCA algorithm O(d3) for an input data X Œ Rd×n, d ≪ n

Read more

Summary

Introduction

As a novel cancer diagnostic paradigm, mass spectroscopic serum proteomic pattern diagnostics was reported superior to the conventional serologic cancer biomarkers. An important factor to prevent this young technology to become a mainstream cancer diagnostic paradigm is that robustly identifying cancer molecular patterns from high-dimensional protein expression data is still a challenge in machine learning and oncology research. Its global feature selection mechanism prevents it from capturing local features This may lead to difficulty in achieving high-performance proteomic pattern discovery, because only features interpreting global data behavior are used to train a learning machine. With the rapid advances in proteomics, mass spectroscopic serum proteomic pattern diagnostics has been appearing as a revolutionary cancer diagnostic paradigm. This technology still remains as an important field in clinical research study rather than a clinical routine testing [1]. There are a large number of m/z ratios in a mass spectral profile, only a small number of them have meaningful contributions to the data variations

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call