Abstract
Microarrays are commonly used in biology because they can simultaneously measure the expression of thousands of genes under different conditions. Because such data typically contain a large number of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus approaches that combine multiple microarray studies have recently been used to find more robust networks. The purpose of this paper, however, is to combine multiple microarray studies to automatically identify subnetworks that are distinctive to specific experimental conditions rather than common to them all. To better understand key regulatory mechanisms and how they change under different conditions, we derive unique networks from multiple independent networks built using glasso, which goes beyond standard correlations. This involves calculating cluster prediction accuracies to detect the most predictive genes for a specific set of conditions. We differentiate between accuracies calculated using cross-validation within a selected cluster of studies (the intra prediction accuracy) and those calculated on a set of independent studies belonging to different study clusters (the inter prediction accuracy). Finally, we compare our method's results to related state-of-the-art techniques. We explore how the proposed pipeline performs on both synthetic data and real data (wheat and Fusarium). Our results show that subnetworks specific to subsets of studies can be identified reliably and that these networks reflect key mechanisms that are fundamental to the experimental conditions in each of those subsets.
Highlights
All organisms have many mechanisms, necessary for their survival, that carry on mostly unchanged under all conditions that the organism is subjected to
We first identify the variables/genes that appear uniquely in the Gene Regulatory Networks (GRNs) of one study or one group of studies, and then derive study-specific gene regulatory networks
Weighted Gene Correlation Network Analysis (WGCNA): One of the main goals of this paper is to explore techniques that go beyond simple pairwise correlations
Summary
All organisms have many mechanisms, necessary for their survival, that carry on mostly unchanged under all conditions that the organism is subjected to (e.g. cell metabolism). Other mechanisms occur only when some event external or internal to the organism (environmental changes, stress, cancer, etc.) happens and triggers them. Some conditions might trigger similar mechanisms (more or less so depending on how similar the conditions are), which researchers identify using consensus network analysis, a technique that finds the links common to a number of studies [1]. Highlighting the similarities, though, can overshadow or even hide what is unique and typical of one specific condition. Biologists are clearly interested in what these similarities are, but they are also interested in identifying the condition-specific mechanisms/gene paths, knowledge of which will help their detailed understanding. The novelty of our approach is the ability to semiautomatically identify subnetworks that are unique to a number of independent studies (unique networks). Identification of unique networks could lead to a better understanding of those mechanisms.
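The idea of a unique network can be illustrated with a minimal sketch. It assumes that each study's network has already been estimated (for example with glasso) and reduced to a list of undirected gene-gene edges. The function name `unique_edges`, the `min_support` threshold, and the toy study and gene names are hypothetical and are not taken from the paper, whose actual selection criterion may differ.

```python
def normalize(edges):
    """Store each undirected edge as a frozenset so that (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges}


def unique_edges(networks, cluster, min_support=1.0):
    """Return edges specific to a cluster of studies.

    An edge is kept when it appears in at least `min_support` (a fraction)
    of the studies inside `cluster` and in no study outside it.
    This rule is an illustrative assumption, not the paper's criterion.
    """
    inside = [normalize(networks[s]) for s in networks if s in cluster]
    outside = [normalize(networks[s]) for s in networks if s not in cluster]

    candidates = set().union(*inside)      # every edge seen inside the cluster
    seen_outside = set().union(*outside)   # every edge seen in other studies

    result = set()
    for edge in candidates:
        support = sum(edge in net for net in inside) / len(inside)
        if support >= min_support and edge not in seen_outside:
            result.add(edge)
    return result


# Toy example: three studies, two of which share an experimental condition.
networks = {
    "study_A": [("g1", "g2"), ("g2", "g3"), ("g4", "g5")],
    "study_B": [("g2", "g1"), ("g2", "g3"), ("g4", "g5")],
    "study_C": [("g2", "g3"), ("g6", "g7")],
}

unique = unique_edges(networks, cluster={"study_A", "study_B"})
print(sorted(tuple(sorted(edge)) for edge in unique))
```

In this toy case the edge between g2 and g3 is discarded because it also appears in study_C, which is the behaviour a consensus analysis would instead reward. Lowering `min_support` below 1.0 would tolerate edges that are missing from some studies in the cluster.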