Abstract

A collaboration arises when individuals interact with each other and contribute meaningfully toward a common goal. The artifacts resulting from collaborations are best captured using a hypergraph model, whereas the relation of who has collaborated with whom is best captured via an abstract simplicial complex (SC). We propose GENESCs, a generative algorithm for SCs that model fundamental collaboration relations. The proposed network growth process favors attachment that is preferential not to an individual’s degree, i.e., the number of people he/she has collaborated with, but to his/her facet degree, i.e., the number of maximal groups, or facets, he/she has collaborated within. Based on our observation that several real-world facet size distributions deviate significantly from a power law (mainly because larger facets tend to subsume smaller ones), we adopt a data-driven approach. We prove that the facet degree distribution yielded by GENESCs follows a power law for large SCs and show that it agrees with real-world co-authorship data. Finally, based on our intuition of how collaborations form in domains such as collaborative scientific experiments and movie production, we propose two variants of GENESCs based on clamped and hybrid preferential attachment schemes, and show that they perform well in these domains.

Highlights

  • GENESCs takes as input the distribution of facet sizes and the average facet density, and generates a collaboration relation using a variant of preferential attachment (PA)

  • We propose GENESCs, a generative model for collaboration structures modeled as simplicial complexes (SCs)

  • While hypergraphs are good for modeling artifacts of collaborations, e.g., papers and movies, SCs are more appropriate for succinctly modeling the inherent structure of the collaboration relationship, i.e., who has collaborated with whom
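To make the distinction between hyperedges, facets, degree, and facet degree concrete, here is a minimal Python sketch. The helper names (`facets`, `degree`, `facet_degree`) and the toy `papers` data are illustrative assumptions, not taken from the paper.

```python
from itertools import chain

def facets(hyperedges):
    """Keep only the maximal sets: a hyperedge that is a proper subset
    of another hyperedge is subsumed and dropped; duplicates collapse."""
    sets = {frozenset(e) for e in hyperedges}
    return [s for s in sets if not any(s < t for t in sets)]

def degree(facet_list, v):
    """Number of distinct individuals v has collaborated with."""
    neighbours = set(chain.from_iterable(f for f in facet_list if v in f))
    return len(neighbours - {v})

def facet_degree(facet_list, v):
    """Number of facets (maximal groups) that contain v."""
    return sum(1 for f in facet_list if v in f)

# Hypothetical toy data: four artifacts (e.g., papers) as author sets.
papers = [{"a", "b"}, {"a", "b", "c"}, {"a", "d"}, {"a", "b"}]
fs = facets(papers)  # {"a", "b"} is subsumed by {"a", "b", "c"}
```

In this toy example the four hyperedges reduce to two facets, so author `a` has three collaborators but a facet degree of two.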


Summary

Introduction

We demonstrate, and justify analytically, that when GENESCs generates facets one after another, with their sizes drawn at random from the facet size distribution of the target real data set given as input, the probability that a subsumption occurs during this random growth process is negligible. This does not contradict our observation that subsumption is common in real collaboration artifact data. This property of GENESCs is a valuable benefit of sampling facets rather than hyperedges.
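The growth process described above can be illustrated with a simplified simulation that draws facet sizes from an input list, picks members preferentially by facet degree, and counts how often a new set subsumes or is subsumed by an earlier one. This is a rough sketch and not the GENESCs algorithm itself: the function `grow`, the parameter `p_new` (the probability of introducing a new individual), and the sample sizes are all assumptions made for illustration.

```python
import random

def grow(facet_sizes, n_facets, p_new=0.5, seed=0):
    """Toy growth process (assumed, simplified): each step draws a size
    from facet_sizes and fills the set with either a brand-new vertex
    (probability p_new) or an existing vertex chosen with probability
    proportional to its current facet degree."""
    rng = random.Random(seed)
    generated = []      # sets generated so far
    facet_degree = {}   # vertex -> number of generated sets containing it
    next_id = 0         # id for the next new vertex
    subsumptions = 0
    for _ in range(n_facets):
        size = rng.choice(facet_sizes)
        members = set()
        while len(members) < size:
            candidates = [v for v in facet_degree if v not in members]
            if candidates and rng.random() > p_new:
                weights = [facet_degree[v] for v in candidates]
                v = rng.choices(candidates, weights=weights, k=1)[0]
            else:
                v = next_id
                next_id += 1
            members.add(v)
        new = frozenset(members)
        # A subsumption: the new set contains, or is contained in, an earlier one.
        if any(new <= f or f <= new for f in generated):
            subsumptions += 1
        generated.append(new)
        for v in new:
            facet_degree[v] = facet_degree.get(v, 0) + 1
    return generated, subsumptions

# Hypothetical input: an empirical list of facet sizes to sample from.
generated, subsumptions = grow([2, 3, 4], n_facets=200)
```

Comparing `subsumptions` with `len(generated)` for different size lists gives an empirical feel for how rarely subsumption occurs under such a process; the analytical claim itself rests on the paper's own argument.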
