Abstract
Background: Biological networks provide great potential to understand how cells function. Network motifs, frequent topological patterns, are key structures through which biological networks operate. Finding motifs in biological networks remains a computationally challenging task as the size of the motif and the underlying network grow. Often, different copies of a given motif topology in a network share nodes or edges. Counting such overlapping copies introduces significant problems in motif identification.

Results: In this paper, we develop a scalable algorithm for finding network motifs. Unlike most existing studies, our algorithm counts independent copies of each motif topology. We introduce a set of small patterns and prove that we can construct any larger pattern by joining those patterns iteratively. By iteratively joining already identified motifs with those patterns, our algorithm avoids (i) constructing topologies that do not exist in the target network and (ii) repeatedly counting the frequency of the motifs generated in subsequent iterations. Our experiments on real and synthetic networks demonstrate that our method is significantly faster and more accurate than existing methods, including SUBDUE and FSG.

Conclusions: We conclude that our method for finding network motifs is scalable and computationally feasible for large motif sizes and a broad range of networks with different sizes and densities. We proved that any motif with four or more edges can be constructed as a join of the small patterns.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1271-7) contains supplementary material, which is available to authorized users.
Highlights
Biological networks describe how molecules interact to carry out various cellular functions
Our counting strategy avoids enumerating the overlapping motif instances by algebraically computing the overlap count from the neighbors of the motif nodes in the target network. Our experiments on both protein-protein interaction (PPI) and synthetic networks demonstrate that our method is significantly faster and more accurate than the existing methods.
We present a case study of the motifs identified by our method on the Human herpesvirus PPI network (Section “Case study on Human herpesvirus”)
Summary
Biological networks describe how molecules interact to carry out various cellular functions. Studying biological networks has great potential to help understand how cells function and how they respond to extra-cellular stimulants. Such studies have already been used successfully in many applications. Identifying motifs has been one of the key steps in understanding the functions served by biological networks such as gene regulatory or protein interaction networks [6,7,8]. Network motifs, which are frequent topological patterns, are key structures through which biological networks operate. Often, different copies of a given motif topology in a network share nodes or edges, which complicates counting how frequently the motif occurs.
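To make the overlap issue concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes that "independent" copies means copies that share no edges (requiring disjoint nodes would be a stricter alternative), uses the triangle as the motif, and picks copies with a simple greedy pass, which is a heuristic and is not guaranteed to find the largest independent set. The toy graph `EDGES` and the function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy network (undirected edges); not taken from the paper.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("b", "d"), ("c", "d"),
         ("d", "e"), ("c", "e")]


def triangles(edges):
    """Enumerate every copy of the triangle motif, overlapping or not."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    found = set()
    for u, v in edges:
        # Any common neighbor of u and v closes a triangle on edge (u, v).
        for w in adj[u] & adj[v]:
            found.add(frozenset((u, v, w)))
    return sorted(sorted(t) for t in found)


def edge_set(copy):
    """Edges of one triangle copy, as unordered node pairs."""
    return {frozenset(pair) for pair in combinations(copy, 2)}


def greedy_edge_disjoint(copies):
    """Greedily keep copies that share no edge with any copy kept so far.

    Assumption: a simple heuristic for illustration only.
    """
    used, chosen = set(), []
    for copy in copies:
        edges_of_copy = edge_set(copy)
        if used.isdisjoint(edges_of_copy):
            chosen.append(copy)
            used |= edges_of_copy
    return chosen


all_copies = triangles(EDGES)
independent = greedy_edge_disjoint(all_copies)
print(len(all_copies), "copies in total,", len(independent), "edge-disjoint")
```

In this toy graph the copy on nodes b, c, d shares an edge with each of the other two copies, so the count that allows overlaps is larger than the count of edge-disjoint copies. The two copies that are kept still share node c, which is why the choice between the edge-based and node-based definitions of independence matters.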