Abstract

Background: Gene expression microarray data have been organized and made available as public databases, but the use of such highly heterogeneous reference datasets to interpret data from individual test samples is not as developed as, for example, in the field of nucleotide sequence comparisons. We have created a rapid and powerful approach for the alignment of microarray gene expression profiles (AGEP) from test samples with those contained in a large annotated public reference database, and we demonstrate here how this can facilitate the interpretation of microarray data from individual samples.

Methods: AGEP is based on the calculation of kernel density distributions for the expression level of each gene in each reference tissue type. It quantifies the similarity between the test sample and each reference tissue type, and identifies the typical and atypical genes in each comparison. As a reference database, we used 1654 samples from 44 normal tissues (extracted from the GeneSapiens database).

Results: Using leave-one-out validation, AGEP correctly defined the tissue of origin for 1521 (93.6%) of all the 1654 samples in the original database. Independent validation with 195 external normal tissue samples resulted in 87% accuracy for the exact tissue type and 97% accuracy when related tissue types were included. AGEP analysis of 10 Duchenne muscular dystrophy (DMD) samples provided a quantitative description of the key pathogenetic events, such as the extent of inflammation, in individual samples and pinpointed tissue-specific genes whose expression changed in DMD (SAMD4A). AGEP analysis of microarray data from adipocytic differentiation of mesenchymal stem cells, and from normal myeloid cell types and leukemias, provided a quantitative characterization of the transcriptomic changes during normal and abnormal cell differentiation.

Conclusions: AGEP is a widely applicable method for the rapid and comprehensive interpretation of microarray data, as shown here by the definition of tissue- and disease-specific changes in gene expression as well as changes during cellular differentiation. The capability to quantitatively compare data from individual samples against a large-scale annotated reference database represents a widely applicable paradigm for the analysis of all types of high-throughput data. AGEP enables systematic and quantitative comparison of gene expression data from test samples against a comprehensive collection of different cell and tissue types previously studied by the entire research community.

Highlights

  • Gene expression microarray data have been organized and made available as public databases, but the utilization of such highly heterogeneous reference datasets in the interpretation of data from individual test samples is not as developed as e.g. in the field of nucleotide sequence comparisons

  • The alignment of microarray gene expression profiles (AGEP) method is based on the use of kernel density estimates for the expression levels of genes across each of the reference sample types

  • Application of the array alignment for the microarray data analysis II: stem cell differentiation. We explored the AGEP method in the analysis and interpretation of transcriptional changes in a study of mesenchymal stem cells differentiating into adipocytes, with three replicate samples measured at 5 time points (0 h, 1 h, 3 h, 9 h and 7 d)
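
The kernel density idea described above can be illustrated with a minimal sketch. It is not the published AGEP implementation: the scoring statistic (mean log-density across genes), the Silverman bandwidth rule, the density floor, and all names and toy values (`gaussian_kde`, `score_sample`, `GENE_A`, `GENE_B`) are assumptions made for illustration only. For each reference tissue and gene, a Gaussian kernel density estimate is built from the reference expression values. A test sample is then scored against each tissue, and the genes with the lowest density are reported as atypical.

```python
import math

def gaussian_kde(values, bandwidth=None):
    """Return a function that estimates the density of `values` with Gaussian kernels."""
    n = len(values)
    if bandwidth is None:
        mean = sum(values) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / max(n - 1, 1))
        # Silverman-style rule of thumb; the lower bound avoids a zero bandwidth (assumption).
        bandwidth = max(1.06 * std * n ** (-1 / 5), 1e-3)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density

def build_reference(reference):
    """Turn {tissue: {gene: [values]}} into {tissue: {gene: density function}}."""
    return {tissue: {gene: gaussian_kde(vals) for gene, vals in genes.items()}
            for tissue, genes in reference.items()}

def score_sample(sample, models, floor=1e-12):
    """Mean log-density of the sample's genes under each tissue (illustrative score)."""
    scores = {}
    for tissue, gene_models in models.items():
        shared = [g for g in sample if g in gene_models]
        # The floor keeps log() finite when a value lies far outside the reference range.
        logs = [math.log(max(gene_models[g](sample[g]), floor)) for g in shared]
        scores[tissue] = sum(logs) / len(logs)
    return scores

def atypical_genes(sample, gene_models, k=1):
    """Return the k genes whose test values are least likely under one tissue's model."""
    shared = [g for g in sample if g in gene_models]
    return sorted(shared, key=lambda g: gene_models[g](sample[g]))[:k]

# Toy reference with two tissues and two hypothetical genes (made-up values).
reference = {
    "muscle": {"GENE_A": [9.0, 9.5, 10.0, 9.2], "GENE_B": [2.0, 2.5, 1.8, 2.2]},
    "liver": {"GENE_A": [3.0, 2.5, 3.5, 3.2], "GENE_B": [8.0, 8.5, 7.8, 8.2]},
}
sample = {"GENE_A": 9.4, "GENE_B": 4.0}

models = build_reference(reference)
scores = score_sample(sample, models)
best = max(scores, key=scores.get)
outliers = atypical_genes(sample, models[best], k=1)
```

In this toy case the sample scores higher against the muscle reference than against liver, while `GENE_B` is flagged as the gene that fits the muscle profile least well. This mirrors, in a very reduced form, the distinction between overall tissue similarity and atypical genes.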

Summary

Introduction

Gene expression microarray data published by the entire biomedical community have been organized and made available for data mining in several public databases (e.g. Oncomine, Gene Expression Omnibus, ArrayExpress, GeneSapiens) [1,2,3,4,5,6,7]. This has facilitated analyses of gene networks and gene regulatory processes [8,9,10,11,12], and the identification of tissue- or disease-specific gene expression patterns [13,14,15,16,17,18,19]. However, the use of such highly heterogeneous reference datasets to interpret data from individual test samples is not as developed as, for example, in the field of nucleotide sequence comparisons. We describe the AGEP method and validate its utility in the analysis of microarray data from normal and disease tissue types as well as in the quantitative analysis of cell differentiation patterns.

