Abstract

Cross-platform or cross-experiment transcriptome data are hard to compare because raw gene expression values from different platforms cannot be compared directly. The ranking information inherent in each expression profile is rarely utilized, although a reduced vector derived from it could represent transcriptome data independently of platform. We therefore converted each expression profile into a rank vector, in which higher expression receives a higher rank value, and then applied latent semantic analysis (LSA) to obtain compact, continuous 100-dimensional vector representations of samples. The resulting vector representations recovered tissue labels in an independent dataset with a precision of 96.7%. We developed a user-friendly tool, TissueSpace, which provides the following functionalities: (1) conversion of different gene ID types to Ensembl gene IDs; (2) projection of any human transcriptome profile into the vector space for downstream analysis; (3) functional enrichment analysis for each of the 100 vector features. Case studies in common human diseases illustrate its usefulness. TissueSpace could be used to generate testable hypotheses for translational medicine. The TissueSpace tool is available at http://bioinformatics.fafu.edu.cn/tissuespace/ .
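
The rank-then-LSA idea described above can be illustrated with a minimal sketch. It is not the TissueSpace implementation: the function names (`rank_transform`, `fit_lsa`, `project`), the toy data, and the use of a plain truncated SVD without any term weighting are all assumptions made for illustration, and 3 components stand in for the 100 used in the paper.

```python
import numpy as np


def rank_transform(expr):
    """Replace each sample's expression values with ranks (1 = lowest expression).

    expr: array-like of shape (n_samples, n_genes). Ties are broken by gene order.
    """
    expr = np.asarray(expr, dtype=float)
    order = np.argsort(expr, axis=1, kind="stable")
    ranks = np.empty_like(order)
    rows = np.arange(expr.shape[0])[:, None]
    # The gene at sorted position j in each row receives rank j + 1.
    ranks[rows, order] = np.arange(1, expr.shape[1] + 1)
    return ranks.astype(float)


def fit_lsa(rank_matrix, n_components):
    """Truncated SVD of a samples x genes rank matrix.

    Returns the gene loadings, shape (n_genes, n_components).
    """
    _, _, vt = np.linalg.svd(rank_matrix, full_matrices=False)
    return vt[:n_components].T


def project(expr, components):
    """Rank-transform new profiles and project them into the reduced space."""
    return rank_transform(expr) @ components


# Toy data (hypothetical): 6 training samples x 20 genes.
rng = np.random.default_rng(0)
train = rng.gamma(2.0, 50.0, size=(6, 20))
components = fit_lsa(rank_transform(train), n_components=3)

# The same sample on two "platforms": raw values and log2-scaled values.
sample = rng.gamma(2.0, 50.0, size=(1, 20))
vec_raw = project(sample, components)
vec_log = project(np.log2(sample + 1.0), components)
```

Because ranks are unchanged by any strictly increasing rescaling of the expression values, `vec_raw` and `vec_log` coincide in this sketch, which is the sense in which a rank-based representation is independent of platform-specific scaling.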
