Abstract

Dimensionality reduction is often used to visualize complex expression profiling data. Here, we use the Uniform Manifold Approximation and Projection (UMAP) method on published transcript profiles of 1484 single gene deletions of Saccharomyces cerevisiae. Proximity in low-dimensional UMAP space identifies groups of genes that correspond to protein complexes and pathways, and finds novel protein interactions, even within well-characterized complexes. This approach is more sensitive than previous methods and should be broadly useful as additional transcriptome datasets become available for other organisms.

Introduction

Dimensionality reduction is often used to visualize complex expression profiling data. Dimensionality reduction methods capture variability in a limited number of random variables to facilitate 2D or 3D visualization of datasets with tens to thousands of dimensions. This approach is recognizable in the commonly used method of principal component analysis (PCA), which uses linear combinations of variables to generate orthogonal axes that efficiently capture the variation present in the data with fewer variables. Because Uniform Manifold Approximation and Projection (UMAP) output preserves elements of the high-dimensional data structure better than comparable output from t-Distributed Stochastic Neighbor Embedding (t-SNE), it captures local relationships within groups of transcriptomes in addition to global relationships between distinct groups[14]. This feature is especially useful for inferring gene relationships, which can arise from physical interaction, overlapping gene function, or coordinated contributions to a larger cellular process. We show that dimensionality reduction by UMAP on bulk expression profiling data of 1484 single-gene mutants of S. cerevisiae links gene function in clusters at increasingly finer scales, corresponding to broad cellular activities, pathways, protein complexes and individual protein-protein interactions.
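The PCA idea mentioned above (orthogonal axes built from linear combinations of the original variables) can be illustrated with a small sketch. This is not the pipeline used in the study, which relies on UMAP. The toy matrix `profiles`, its values, and all function names are hypothetical and chosen only for illustration. Power iteration with deflation is used here solely to keep the example dependency-free; in practice a numerical library would normally be used instead.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(mat, iters=500, seed=0):
    """Leading eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    rng = random.Random(seed)
    d = len(mat)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    eigval = sum(v[i] * sum(mat[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigval, v


def pca(rows, n_components=2):
    """Return (components, variances, scores) for the leading principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components, variances = [], []
    for _ in range(n_components):
        val, vec = top_eigenpair(cov)
        components.append(vec)
        variances.append(val)
        # Deflation: remove the variance already explained by this axis.
        cov = [[cov[i][j] - val * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    # Each score is a linear combination of the centered original variables.
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in centered]
    return components, variances, scores


# Hypothetical toy data: 6 expression profiles (rows) x 4 measured transcripts (columns).
profiles = [
    [2.0, 1.9, 0.1, -0.2],
    [1.8, 2.1, -0.1, 0.0],
    [-2.0, -1.8, 0.2, 0.1],
    [-1.9, -2.2, 0.0, -0.1],
    [0.1, -0.1, 1.0, -1.1],
    [0.0, 0.1, -1.2, 0.9],
]

components, variances, scores = pca(profiles, n_components=2)
for row in scores:
    print([round(x, 3) for x in row])
```

Each row of `scores` is the 2D position of one profile. Unlike this linear projection, UMAP builds its embedding from neighborhood relationships, which is why the text describes it as better suited to capturing local structure.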

