Abstract

Background: Mouse embryonic stem cells (mESCs) are derived from the inner cell mass of a developing blastocyst and can be cultured indefinitely in vitro. Their distinct features are their ability to self-renew and to differentiate into all adult cell types. Genes that maintain mESC self-renewal and pluripotency are of interest to stem cell biologists. Although significant steps have been made toward the identification and characterization of such genes, the list is still incomplete and controversial. For example, the overlap among candidate self-renewal and pluripotency genes across different RNAi screens is surprisingly small. Meanwhile, machine learning approaches have been used to analyze multi-dimensional experimental data and integrate results from many studies, yet they have not been applied specifically to the task of predicting and classifying self-renewal and pluripotency gene membership.

Results: For this study we developed a classifier, a supervised machine learning framework based on support vector machines (SVM), for predicting mESC stemness membership genes (MSMGs), that is, genes involved in self-renewal and pluripotency. The data used to train the classifier were derived from two kinds of mESC-related studies: mRNA microarray studies measuring gene expression at various stages of early differentiation, and ChIP-seq studies profiling the genome-wide binding of key transcription factors, such as Nanog, Oct4, and Sox2, to the regulatory regions of other genes. We compared the classifier with other classification methods using leave-one-out cross-validation to evaluate the accuracy and generality of the classification. Finally, two sets of candidate genes from genome-wide RNA interference screens were used to test the generality and potential application of the classifier.

Conclusions: Our results reveal that an SVM approach can be useful for prioritizing genes for functional validation experiments and can complement the analysis of high-throughput profiling data in stem cell research.

Highlights

  • Mouse embryonic stem cells are derived from the inner cell mass of a developing blastocyst and can be cultured indefinitely in vitro.

  • Learning from heterogeneous data types: we extracted 91 features/attributes for each gene from mRNA gene expression and ChIP-seq experiments in mouse embryonic stem cell (mESC)-related studies. Of these, 79 features/attributes were created from mRNA expression microarray profiling data extracted from the Gene Expression Omnibus (GEO) database [12]; references to the files are provided in the Methods and in Additional files 1, 2 and 3.

  • We implemented two types of preprocessing approaches for generating features/attributes from the ChIP-seq datasets. With the first approach, we converted the results from the ChIP-seq experiments into Boolean values, where zero represents the absence and one the presence of a binding site, detected as a peak in a ChIP-seq experiment, in proximity to a gene.
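
The Boolean conversion described in the last highlight can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the function name `boolean_binding_features`, the toy gene names and coordinates, and in particular the 5,000 bp window used to decide what counts as "in proximity to a gene".

```python
from typing import Dict, List, Tuple

# Hypothetical toy inputs (not data from the study):
# gene name -> (chromosome, transcription start site position)
genes: Dict[str, Tuple[str, int]] = {
    "GeneA": ("chr1", 10_000),
    "GeneB": ("chr2", 50_000),
}
# transcription factor -> list of ChIP-seq peaks as (chromosome, position)
peaks_by_factor: Dict[str, List[Tuple[str, int]]] = {
    "Nanog": [("chr1", 12_000)],
    "Oct4": [("chr2", 49_000), ("chr1", 90_000)],
    "Sox2": [],
}


def boolean_binding_features(
    genes: Dict[str, Tuple[str, int]],
    peaks_by_factor: Dict[str, List[Tuple[str, int]]],
    window: int = 5_000,  # assumed proximity cutoff, not from the source
) -> Tuple[List[str], Dict[str, List[int]]]:
    """Return the factor order and, per gene, one 0/1 value per factor.

    A value of 1 means at least one peak for that factor lies on the same
    chromosome within `window` bases of the gene's start site; 0 means none.
    """
    factors = sorted(peaks_by_factor)
    features: Dict[str, List[int]] = {}
    for name, (chrom, tss) in genes.items():
        row = []
        for factor in factors:
            bound = any(
                peak_chrom == chrom and abs(peak_pos - tss) <= window
                for peak_chrom, peak_pos in peaks_by_factor[factor]
            )
            row.append(1 if bound else 0)
        features[name] = row
    return factors, features


factors, features = boolean_binding_features(genes, peaks_by_factor)
for gene_name, row in features.items():
    print(gene_name, dict(zip(factors, row)))
```

Each gene ends up with one Boolean column per transcription factor, which can then sit alongside the expression-derived features in a single feature vector.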


Summary

Introduction

Mouse embryonic stem cells (mESCs) are derived from the inner cell mass of a developing blastocyst and can be cultured indefinitely in vitro. Their distinct features are their ability to self-renew and to differentiate into all adult cell types, including the germ line. Machine learning approaches have been used to analyze multi-dimensional experimental data and integrate results from many studies, yet they have not been applied to the task of predicting and classifying self-renewal and pluripotency gene membership. We attempted to use such an approach to predict mESC stemness membership genes (MSMGs) from two types of high-throughput data, combining several independent studies.
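
The abstract describes training an SVM on per-gene features and evaluating it with leave-one-out cross-validation. The sketch below illustrates that evaluation loop only; it is not the authors' implementation. It assumes a linear kernel trained by plain sub-gradient descent on the regularized hinge loss, and the two-feature toy data, the hyperparameters, and all function names are hypothetical.

```python
from typing import List, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def train_linear_svm(X: List[List[float]], y: List[int],
                     lam: float = 0.01, epochs: int = 200,
                     lr: float = 0.1) -> Model:
    """Full-batch sub-gradient descent on the L2-regularized hinge loss.

    Labels must be +1 or -1. lam, epochs, and lr are illustrative values.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to hinge loss
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model: Model, x: List[float]) -> int:
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def leave_one_out_accuracy(X: List[List[float]], y: List[int]) -> float:
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        model = train_linear_svm(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        if predict(model, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: column 0 stands in for an expression-derived feature,
# column 1 for a Boolean binding feature; +1 marks a stemness gene.
X = [[2.0, 1.0], [1.8, 1.0], [2.2, 0.0], [1.9, 1.0],
     [-2.0, 0.0], [-1.8, 1.0], [-2.1, 0.0], [-1.9, 0.0]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

accuracy = leave_one_out_accuracy(X, y)
print(f"Leave-one-out accuracy: {accuracy:.2f}")
```

Leave-one-out cross-validation suits small labeled sets, since every gene is used for testing exactly once while all the others are used for training. In practice an established SVM library would replace the hand-written trainer.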

