Abstract

The identification of motifs--subgraphs that appear significantly more often in a particular network than in an ensemble of randomized networks--has become a ubiquitous method for uncovering potentially important subunits within networks drawn from a wide variety of fields. We find that the most common algorithms used to generate the ensemble from the real network change subgraph counts in a highly correlated manner, such that one subgraph's status as a motif may not be independent from the statuses of the other subgraphs. We demonstrate this effect for the problem of three- and four-node motif identification in the transcriptional regulatory networks of E. coli and S. cerevisiae in which randomized networks are generated via an edge-swapping algorithm. We find strong correlations among subgraph counts; for three-node subgraphs these correlations are easily interpreted, and we present an information-theoretic tool that may be used to identify correlations among subgraphs of any size. Our results suggest that single-feature statistics such as Z scores that implicitly assume independence among subgraph counts constitute an insufficient summary of the network.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.