Abstract

Background

The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems ranging from biomarker identification to the screening of disease causing genes using gene expression/epigenetic profiles. However, the conditions required for its successful use and the mechanisms by which it outperforms supervised methods are unknown, because PCA based unsupervised FE has only been applied to challenging (i.e. not well known) problems.

Results

In this study, PCA based unsupervised FE was applied to an extensively studied organism, budding yeast. When applied to two gene expression profiles expected to be temporally periodic, yeast metabolic cycle (YMC) and yeast cell division cycle (YCDC), PCA based unsupervised FE outperformed simple but powerful conventional methods based on sinusoidal fitting in several respects: (i) feasible biological term enrichment without assuming periodicity for YMC; (ii) identification of periodic profiles whose period was half as long as the cell division cycle for YMC; and (iii) identification of no more than 37 genes associated with the enrichment of biological terms related to the cell division cycle in the integrated analysis of seven YCDC profiles, for which sinusoidal fitting failed. The explanation for the differences between the methods, and the conditions required for success, were determined by comparing PCA based unsupervised FE with fittings to various periodic (artificial, thus pre-defined) profiles. Furthermore, four popular unsupervised clustering algorithms applied to YMC were not as successful as PCA based unsupervised FE.

Conclusions

PCA based unsupervised FE is a useful and effective unsupervised method to investigate YMC and YCDC. This study identified why the unsupervised method without pre-judged criteria outperformed supervised methods requiring human defined criteria.

Electronic supplementary material

The online version of this article (doi:10.1186/s13040-016-0101-9) contains supplementary material, which is available to authorized users.

Highlights

  • The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems ranging from biomarker identification to the screening of disease causing genes using gene expression/epigenetic profiles

  • PCA based unsupervised FE successfully identified stable sets composed of limited numbers of circulating microRNA that discriminated between multiple diseases, genes associated with aberrant promoter methylation common to three distinct autoimmune diseases (by integrating their promoter methylation profiles), and candidate disease-causing genes ranging from cancers to neurodegenerative diseases (by integrating distinct expression profiles)

  • PCA based unsupervised FE was applied to temporal gene expression observed during the yeast metabolic cycle (YMC) [13]


Summary

Introduction

The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems ranging from biomarker identification to the screening of disease causing genes using gene expression/epigenetic profiles. The methodology is not yet widely adopted, possibly because neither the conditions required for its successful use nor the mechanisms by which it outperforms other methods have been reported. This gap exists because PCA based unsupervised FE was previously applied, in order to demonstrate its superiority, to challenging problems that conventional methods cannot deal with. Without results from conventional methods to compare against, the reasons why PCA based unsupervised FE can outperform them cannot be determined.
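
To make the kind of comparison discussed here concrete, the following is a minimal sketch on artificial periodic profiles. It is not the authors' implementation: the text above does not spell out the algorithm, so the details are assumptions. In particular, the sketch assumes that genes are selected as outliers in the space of the first two PC scores using a chi-squared test with Benjamini-Hochberg correction, and that the baseline is a least-squares sinusoidal fit with a user-supplied period. All function names, parameters, and data sizes are hypothetical.

```python
import numpy as np


def make_profiles(n_genes=1000, n_periodic=60, n_times=36, period=12.0,
                  noise=0.3, seed=0):
    """Artificial data: the first n_periodic genes oscillate, the rest are noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_times, dtype=float)
    x = rng.normal(0.0, noise, size=(n_genes, n_times))
    phases = rng.uniform(0.0, 2.0 * np.pi, size=n_periodic)
    x[:n_periodic] += np.sin(2.0 * np.pi * t / period + phases[:, None])
    return x, t


def pca_unsupervised_fe(x, alpha=0.01):
    """Select genes that are outliers along the first two PC scores.

    Assumption: the outlier test and the choice of two PCs are illustrative.
    No period or class label is supplied to this function.
    """
    # Standardise each time point (column) across genes.
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :2] * s[:2]   # PC scores, one row per gene
    loadings = vt[:2]           # PC loadings, one row per PC over time points
    # Squared standardised distance from the origin in PC-score space.
    d2 = ((scores / scores.std(axis=0)) ** 2).sum(axis=1)
    # Chi-squared upper tail; exp(-d2 / 2) is exact for 2 degrees of freedom.
    p = np.exp(-d2 / 2.0)
    # Benjamini-Hochberg adjustment of the P-values.
    order = np.argsort(p)
    ranked = p[order] * len(p) / np.arange(1, len(p) + 1)
    adjusted = np.minimum.accumulate(ranked[::-1])[::-1]
    selected = np.sort(order[adjusted < alpha])
    return selected, loadings


def sinusoid_fit_r2(x, t, period):
    """R^2 of a least-squares fit a + b*sin + c*cos with an assumed period."""
    w = 2.0 * np.pi * t / period
    design = np.column_stack([np.ones_like(t), np.sin(w), np.cos(w)])
    coef, *_ = np.linalg.lstsq(design, x.T, rcond=None)
    ss_res = ((x.T - design @ coef) ** 2).sum(axis=0)
    ss_tot = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    return 1.0 - ss_res / ss_tot


x, t = make_profiles()
selected, loadings = pca_unsupervised_fe(x)
r2_right = sinusoid_fit_r2(x, t, period=12.0)  # period guessed correctly
r2_wrong = sinusoid_fit_r2(x, t, period=7.0)   # hypothetical wrong guess

print("genes selected by PCA-based FE:", len(selected))
print("median R^2 of periodic genes, correct period:", np.median(r2_right[:60]))
print("median R^2 of periodic genes, wrong period:", np.median(r2_wrong[:60]))
```

The point of the sketch is the difference in inputs: the sinusoidal fit needs a period chosen in advance and degrades when that choice is wrong, whereas the PCA step recovers the oscillating loadings from the data and only then asks which genes contribute to them. Real expression profiles are far less clean than this artificial example, so the numbers it prints should not be read as evidence for either method.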
