Abstract

Many social networks exhibit community structure, where individuals form discrete subgroups. The composition of such groupings is important for numerous research directions, but their characterization is challenged by data sampling issues. In wild populations, where individuals range over large distances and observation can be limited, the social data required to resolve community structure are difficult to collect. Recent studies used simulated data sets to determine the robustness of individual-level network metrics under suboptimal sampling conditions, but the sensitivity of community detection algorithms to imperfect sampling has not been assessed. Here, we used simulated data sets to determine how sampling effort and skew influence the ability of three community detection algorithms (fastgreedy, walktrap and louvain) to recover the ‘true’ community structure of networks under two sampling regimes (field-based observational sampling and sampling through biologgers, e.g. proximity detectors). We also examined the robustness of a measure of uncertainty in estimated community structure (r_com). We based our simulated societies on contact patterns in wild male African elephants, a model system reflecting common sampling challenges of large wild populations. Our results indicate that the accuracy of the algorithms improved with increasing sampling effort and decreasing sampling skew. Under the field sampling regime, when sampling effort is constrained, mid-levels of sampling skew may provide a reasonable compromise between maximizing the mean number of observations per individual and minimizing sampling skew. Even with skewed data, r_com can provide a reliable measure of uncertainty in the estimated community assignments, but it should be interpreted cautiously with highly skewed data. The network structures explored represent common sampling challenges for wild populations, but unexplored sampling regimes may drive somewhat different dynamics.
Our simulations indicate that adequate sampling, even when skewed, can be informative, and that maximizing the number of observations across all individuals should be a general objective.

• Useful information about community structure can be obtained with skewed sampling.
• Accurate community assignment requires many observations of each individual.
• Moderately skewed sampling can improve results while reducing sampling requirements.
• r_com is a robust measure of uncertainty even when sampling is skewed.
• Results from biologgers are robust with as few as 30% of the population tagged.
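
The interplay of sampling effort and sampling skew described above can be illustrated with a toy simulation. The sketch below is not the authors' pipeline: the power-law weighting used for skew, the dyadic sighting model, the `p_within` parameter, and all function names are assumptions made for illustration. Communities are recovered with a simple stand-in (connected components of a thresholded co-sighting graph) rather than fastgreedy, walktrap or louvain, and agreement with the true partition is scored with the Rand index rather than the accuracy measure used in the study.

```python
import random
from itertools import combinations


def sampling_weights(n, skew):
    # Hypothetical skew model: skew = 0 gives equal weights; larger values
    # concentrate sampling effort on the low-index individuals.
    return [(i + 1) ** -skew for i in range(n)]


def simulate_sightings(membership, n_obs, skew, p_within, rng):
    # Each sighting records one focal individual with one partner. The partner
    # comes from the focal's own community with probability p_within.
    n = len(membership)
    weights = sampling_weights(n, skew)
    counts = {}
    for _ in range(n_obs):
        focal = rng.choices(range(n), weights=weights)[0]
        same = [j for j in range(n)
                if j != focal and membership[j] == membership[focal]]
        other = [j for j in range(n) if membership[j] != membership[focal]]
        pool = same if (rng.random() < p_within and same) else (other or same)
        partner = rng.choice(pool)
        pair = (min(focal, partner), max(focal, partner))
        counts[pair] = counts.get(pair, 0) + 1
    return counts


def threshold_components(n, counts, min_count):
    # Stand-in for a community detection algorithm: keep pairs sighted at
    # least min_count times and label the connected components (union-find).
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), c in counts.items():
        if c >= min_count:
            parent[find(a)] = find(b)
    return [find(i) for i in range(n)]


def rand_index(labels_a, labels_b):
    # Fraction of pairs on which the two partitions agree (together vs apart).
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


if __name__ == "__main__":
    truth = [0] * 10 + [1] * 10  # two true communities of 10 individuals
    rng = random.Random(42)
    for n_obs in (50, 200, 800):        # sampling effort
        for skew in (0.0, 1.0, 2.0):    # sampling skew
            counts = simulate_sightings(truth, n_obs, skew, 0.95, rng)
            labels = threshold_components(len(truth), counts, 3)
            print(n_obs, skew, round(rand_index(truth, labels), 3))
```

Running the script prints one Rand index per combination of effort and skew. Under this toy model, heavily skewed sampling tends to leave rarely observed individuals isolated, which is one simple way skew can degrade recovered structure; the exact values depend on the assumed parameters and should not be read as results from the study.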

