Abstract

Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

Highlights

  • IntroductionComplex diseases involve multiple intertwined signaling circuitries. Cancer is an excellent example with a number of biological machineries activated in tumor pathogenesis including proliferation, angiogenesis, avoidance of cell death, evasion of tumor suppressing mechanisms, immortality, invasion etc[1]

  • Often, complex diseases involve multiple intertwined signaling circuitries

  • We developed a novel co-expression network analysis framework named Multiscale Embedded Gene co-Expression Network Analysis (MEGENA) that can effectively and efficiently construct and analyze large scale planar filtered co-expression networks

Read more

Summary

Introduction

Complex diseases involve multiple intertwined signaling circuitries. Cancer is an excellent example with a number of biological machineries activated in tumor pathogenesis including proliferation, angiogenesis, avoidance of cell death, evasion of tumor suppressing mechanisms, immortality, invasion etc[1]. Networks of these intertwined signaling cascades, such as protein-protein interaction networks and metabolic networks are highly heterogeneous[3,4,5] These networks share certain characteristics such as the scale-free property (the degree distribution follows a power law), small world effect (diameter of network scales with logarithm/double-logarithm of the number of nodes)[3, 5], assortativity (preference for a network’s nodes to attach to others that are similar in some ways, i.e., high degree nodes tend to attach to high/low degree nodes)[6], and community structures [7, 8]. These observations suggest that the biological networks may follow the similar evolutionary dynamics, and network analysis approaches from other domains are very helpful for understanding biological networks[4]

Methods
Results
Discussion
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.