Abstract

A common goal in biological sciences is to model a complex web of connections using a small number of interacting units. We present a general approach for dividing up elements in a spatial map based on their connectivity properties, allowing for the discovery of local regions underlying large-scale connectivity matrices. Our method is specifically designed to respect spatial layout and identify locally-connected clusters, corresponding to plausible coherent units such as strings of adjacent DNA base pairs, subregions of the brain, animal communities, or geographic ecosystems. Instead of using approximate greedy clustering, our nonparametric Bayesian model infers a precise parcellation using collapsed Gibbs sampling. We utilize an infinite clustering prior that intrinsically incorporates spatial constraints, allowing the model to search directly in the space of spatially-coherent parcellations. After showing results on synthetic datasets, we apply our method to both functional and structural connectivity data from the human brain. We find that our parcellation is substantially more effective than previous approaches at summarizing the brain’s connectivity structure using a small number of clusters, produces better generalization to individual subject data, and reveals functional parcels related to known retinotopic maps in visual cortex. Additionally, we demonstrate the generality of our method by applying the same model to human migration data within the United States. This analysis reveals that migration behavior is generally influenced by state borders, but also identifies regional communities which cut across state lines. Our parcellation approach has a wide range of potential applications in understanding the spatial structure of complex biological networks.

Highlights

  • When studying biological systems at any scale, scientists are often interested in the properties of individual molecules, cells, or organisms, and in the web of connections between these units

  • In this paper we present the first general solution to this problem, introducing a new generative probabilistic model to parcellate a spatial map into local regions with connectivity properties that are as uniform as possible

  • Analogous to the approach taken in stochastic block modeling (Aicher, Jacobs & Clauset, 2014), we model the connectivity between each pair of parcels as a separate distribution with latent parameters
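
As a rough illustration of the last highlight, the sketch below summarizes the connectivity between every pair of parcels with its own mean and variance. It assumes a Gaussian summary per block purely for illustration, since the model's actual likelihood and prior are not given in this summary. The names `block_summaries`, `D`, and `z` are hypothetical, and the parcel labels are taken as given rather than inferred by Gibbs sampling.

```python
import numpy as np

def block_summaries(D, z):
    """Mean and variance of connectivity for every ordered pair of parcels.

    D : (n, n) array of pairwise connectivity values (diagonal ignored).
    z : length-n integer array of parcel labels in 0..K-1 (assumed given).
    """
    D = np.asarray(D, dtype=float)
    z = np.asarray(z)
    K = int(z.max()) + 1
    off_diag = ~np.eye(len(z), dtype=bool)  # exclude self-connections
    means = np.full((K, K), np.nan)
    variances = np.full((K, K), np.nan)
    for a in range(K):
        for b in range(K):
            # Entries whose source element is in parcel a and target in parcel b
            mask = np.outer(z == a, z == b) & off_diag
            vals = D[mask]
            if vals.size:
                means[a, b] = vals.mean()
                variances[a, b] = vals.var()
    return means, variances

# Toy data: two parcels of three elements each, with made-up block means
rng = np.random.default_rng(0)
z = np.array([0, 0, 0, 1, 1, 1])
true_means = np.array([[2.0, -1.0], [0.5, 3.0]])
D = true_means[z][:, z] + 0.1 * rng.standard_normal((6, 6))
means, variances = block_summaries(D, z)
```

When the labels match the underlying structure, each block has low variance, which is one way to read "connectivity properties that are as uniform as possible".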


Summary

Introduction

When studying biological systems at any scale, scientists are often interested in the properties of individual molecules, cells, or organisms, and in the web of connections between these units. The rise of massive biological datasets has enabled us to measure these second-order interactions more accurately, in domains ranging from protein–protein interactions to neural networks to ecosystem food webs. […] 2004; Hartwell et al., 1999), including protein–protein interactions (Rives & Galitski, 2003), metabolic networks (Ravasz et al., 2002), bacterial co-occurrence (Freilich et al., 2010), pollination networks (Olesen et al., 2007), and food webs (Krause, Frank & Mason, 2003). Many methods exist for clustering connectivity data, such as k-means (Kim et al., 2010; Golland et al., 2008; Lee et al., 2012), Gaussian mixture modeling (Golland, Golland & Malach, 2007), hierarchical clustering (Mumford et al., 2010; Cordes et al., 2002; Gorbach et al., 2011), normalized cut (Van den Heuvel, Mandl & Hulshoff Pol, 2008), infinite relational modeling (Morup et al., 2010), force-directed graph layout (Crippa et al., 2011), weighted stochastic block modeling (Aicher, Jacobs & Clauset, 2014), and self-organized mapping (Mishra et al., 2014; Wiggins et al., 2011).
