Abstract
Background: Modular structures are ubiquitous across various types of biological networks. The study of network modularity can help reveal regulatory mechanisms in systems biology, evolutionary biology and developmental biology. Identifying putative modular latent structures from high-throughput data using exploratory analysis can help better interpret the data and generate new hypotheses. Unsupervised learning methods designed for global dimension reduction or clustering fall short of identifying modules with factors acting in linear combinations.

Results: We present an exploratory data analysis method named MLSA (Modular Latent Structure Analysis) to estimate modular latent structures, which can find co-regulative modules that involve non-coexpressive genes.

Conclusions: Through simulations and real-data analyses, we show that the method can recover modular latent structures effectively. In addition, the method also performed very well on data generated from sparse global latent factor models. The R code is available at http://userwww.service.emory.edu/~tyu8/MLSA/.
Highlights
Modular structures are ubiquitous across various types of biological networks
The most common modular structures are genes co-regulated by common transcription factors (TFs) [2,3,4], proteins that interact with common hub proteins [5,6], and metabolites in the same metabolic pathway [7]
These studies suggested that the transcription levels of TFs generally do not reflect their activity levels, which argues for the use of latent variable models
Summary
Modular structures are ubiquitous across various types of biological networks. The study of network modularity can help reveal regulatory mechanisms in systems biology, evolutionary biology and developmental biology. The most common modular structures are genes co-regulated by common transcription factors (TFs) [2,3,4], proteins that interact with common hub proteins [5,6], and metabolites in the same metabolic pathway [7]. Unsupervised learning methods, such as methods for dimension reduction and clustering, are used to find underlying data structures [8,9] and to generate lower-dimensional data for downstream analysis [10,11,12]. In a modular latent structure, the non-zero loadings should form blocks, with every block corresponding to one module.
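
To make the block structure of the loadings concrete, the following is a minimal sketch in plain Python. It is not the MLSA algorithm and does not reproduce the authors' R code; it only simulates data under an assumed model in which each module has its own latent factors and each gene in a module is a linear combination of those factors plus noise. The function names (`block_loadings`, `simulate_expression`), the module sizes, and the noise level are all hypothetical choices made for illustration.

```python
import random


def block_loadings(module_sizes, factors_per_module, rng):
    """Build a genes x factors loading matrix whose non-zero entries form blocks.

    Assumption for illustration: every module owns its own set of latent
    factors, and genes load only on the factors of their own module.
    """
    n_genes = sum(module_sizes)
    n_factors = factors_per_module * len(module_sizes)
    loadings = [[0.0] * n_factors for _ in range(n_genes)]
    first_gene = 0
    for module, size in enumerate(module_sizes):
        first_col = module * factors_per_module
        for gene in range(first_gene, first_gene + size):
            for col in range(first_col, first_col + factors_per_module):
                loadings[gene][col] = rng.gauss(0.0, 1.0)
        first_gene += size
    return loadings


def simulate_expression(loadings, n_samples, noise_sd, rng):
    """Simulate a genes x samples matrix: loadings times factors, plus noise."""
    n_factors = len(loadings[0])
    # Latent factor activities, one row per factor (unobserved in practice).
    factors = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)]
               for _ in range(n_factors)]
    data = []
    for gene_loadings in loadings:
        row = []
        for s in range(n_samples):
            signal = sum(w * f[s] for w, f in zip(gene_loadings, factors))
            row.append(signal + rng.gauss(0.0, noise_sd))
        data.append(row)
    return data


rng = random.Random(0)
# Two hypothetical modules (3 genes and 2 genes), each driven by 2 factors.
L = block_loadings(module_sizes=[3, 2], factors_per_module=2, rng=rng)
X = simulate_expression(L, n_samples=5, noise_sd=0.1, rng=rng)
```

Because each module in this sketch is driven by two factors, two genes in the same module can weight those factors very differently and so need not be strongly co-expressed, even though they share the same latent regulators.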