Data-driven modeling of collaboration networks: a cross-domain analysis

Mario V Tomasello,Frank Schweitzer,Giacomo Vaccario

doi:10.1140/epjds/s13688-017-0117-5

Open Access

https://doi.org/10.1140/epjds/s13688-017-0117-5

Journal: EPJ Data Science	Publication Date: Sep 6, 2017
Citations: 19	License type: open-access

Affiliation: Ernst Basler + Partner (Switzerland), ETH Zurich

Abstract

We analyze large-scale data sets about collaborations from two different domains: economics, specifically 22,000 R&D alliances between 14,500 firms, and science, specifically 300,000 co-authorship relations between 95,000 scientists. Considering the different domains of the data sets, we address two questions: (a) to what extent do the collaboration networks reconstructed from the data share common structural features, and (b) can their structure be reproduced by the same agent-based model? In our data-driven modeling approach we use aggregated network data to calibrate the probabilities at which agents establish collaborations with either newcomers or established agents. The model is then validated by its ability to reproduce network features not used for calibration, including distributions of degrees, path lengths, local clustering coefficients and sizes of disconnected components. Emphasis is put on comparing domains, but also sub-domains (economic sectors, scientific specializations). Interpreting the link probabilities as strategies for link formation, we find that in R&D collaborations newcomers prefer links with established agents, while in co-authorship relations newcomers prefer links with other newcomers. Our results shed new light on the long-standing question about the role of endogenous and exogenous factors (i.e., different information available to the initiator of a collaboration) in network formation.
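
The link-formation idea in the abstract can be illustrated with a toy simulation. The following is only a minimal sketch under simplifying assumptions, not the authors' model: the function name `simulate_collaborations`, the parameters `p_initiator_new` and `p_partner_new`, and the uniform random choice among established agents are all hypothetical and chosen for illustration. Each step adds one collaboration whose initiator and partner are either a newcomer or an established agent.

```python
import random
from collections import defaultdict


def simulate_collaborations(n_events, p_initiator_new, p_partner_new, seed=0):
    """Grow an undirected network one collaboration (link) at a time.

    p_initiator_new: probability that the initiator is a newcomer
                     (hypothetical parameter, not taken from the paper).
    p_partner_new:   dict mapping the initiator type ("new" or "old") to the
                     probability that the chosen partner is a newcomer.
    """
    rng = random.Random(seed)
    adjacency = defaultdict(set)
    # Seed the network with one established pair so that choosing an
    # established agent is always possible.
    adjacency[0].add(1)
    adjacency[1].add(0)
    next_id = 2

    for _ in range(n_events):
        established = list(adjacency)  # agents present before this event
        initiator_type = "new" if rng.random() < p_initiator_new else "old"
        if initiator_type == "new":
            initiator = next_id
            next_id += 1
        else:
            initiator = rng.choice(established)

        if rng.random() < p_partner_new[initiator_type]:
            partner = next_id
            next_id += 1
        else:
            # Assumption: established partners are drawn uniformly at random.
            candidates = [a for a in established if a != initiator]
            partner = rng.choice(candidates)

        adjacency[initiator].add(partner)
        adjacency[partner].add(initiator)
    return adjacency


def degree_counts(adjacency):
    """Return {degree: number of agents with that degree}."""
    counts = defaultdict(int)
    for neighbours in adjacency.values():
        counts[len(neighbours)] += 1
    return dict(counts)
```

As a usage example, `simulate_collaborations(1000, 0.5, {"new": 0.1, "old": 0.3})` would mimic a regime in which newcomers mostly attach to established agents, while raising `p_partner_new["new"]` toward 1 would mimic newcomers teaming up with other newcomers. These values are illustrative only and are not calibrated to any data set.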

Highlights

The availability of large-scale, time-resolved data sets about economic, scientific or social activities opens new avenues to address the long-standing question of how we collaborate.
We aim at an agent-based model that includes a minimalistic set of microscopic rules. We argue that this agent-based model is correct if it is able to reproduce a specific set of macroscopic properties of the different collaboration networks, namely degree distribution, path length distribution, distribution of community sizes, that are not used for the calibration of the model.
We report our findings about the path length between agents before they engage in a collaboration in Figure for the ‘Pharmaceuticals’ Research and Development (R&D) network, and in Figure for the coauthorship network in interdisciplinary physics.
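
The macroscopic properties named above can be computed from an adjacency structure with a few lines of code. The sketch below is a generic illustration, not the authors' validation pipeline: the helper names (`bfs_distances`, `component_sizes`, `path_length_counts`, `local_clustering`) are hypothetical, the network is assumed to be an undirected graph stored as a dict of neighbour sets with integer node ids, and connected components stand in for the component sizes mentioned in the abstract.

```python
from collections import Counter, deque


def bfs_distances(adjacency, source):
    """Shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def component_sizes(adjacency):
    """Sizes of the connected components, largest first."""
    seen = set()
    sizes = []
    for node in adjacency:
        if node not in seen:
            component = bfs_distances(adjacency, node)
            seen.update(component)
            sizes.append(len(component))
    return sorted(sizes, reverse=True)


def path_length_counts(adjacency):
    """Counter {path length: number of connected unordered node pairs}."""
    counts = Counter()
    for source in adjacency:
        for target, d in bfs_distances(adjacency, source).items():
            if source < target:  # count each unordered pair once
                counts[d] += 1
    return counts


def local_clustering(adjacency, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbs = list(adjacency[node])
    k = len(nbs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbs[j] in adjacency[nbs[i]]
    )
    return 2 * links / (k * (k - 1))


# Tiny example: a triangle 0-1-2 with a tail 2-3, plus a separate pair 4-5.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: {5}, 5: {4}}
```

For the `example` network, `component_sizes(example)` gives `[4, 2]` and `path_length_counts(example)` counts five pairs at distance 1 and two pairs at distance 2. Comparing such distributions between an empirical network and simulated ones is one simple way to check a model against features that were not used to calibrate it.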

Summary

Introduction

The availability of large-scale, time-resolved data sets about economic, scientific or social activities opens new avenues to address the long-standing question of how we collaborate. This question becomes more important as globalization leads to a vast increase of collaborations in many areas of human activity, including science and economics. One could argue that collaboration patterns change with respect to the actors and the domain of activity, but there may be evidence for common features across different domains. In the latter case, we could hypothesize that a unified modeling approach should be able to reproduce, and to explain, the structural and the dynamic features of collaborations in different domains. We provide a new flexible model that allows us to understand collaboration patterns.

Similar Papers

THE ROLE OF NETWORK EMBEDDEDNESS ON THE SELECTION OF COLLABORATION PARTNERS: AN AGENT-BASED MODEL WITH EMPIRICAL VALIDATION
Frank Schweitzer ... Mario V Tomasello
Advances in Complex Systems | VOL. 25
01 Mar 2022

Data-driven modelling of signal-transduction networks
Kevin A Janes ... Michael B Yaffe
Nature Reviews Molecular Cell Biology | VOL. 7
01 Nov 2006

Reproducing Scientists’ Mobility: A Data-Driven Model
Giacomo Vaccario ... Frank Schweitzer
SSRN Electronic Journal | VOL. -
01 Jan 2018

Estimating anthropogenic subsoil compaction in Germany using data-driven reciprocal modelling
Laura Sofie Harbo ... Florian Schneider
08 Mar 2024
