Abstract

Diffusion processes in social networks often cause the emergence of global phenomena from individual behavior within a society. The study of those global phenomena and the simulation of those diffusion processes frequently require a good model of the global network. However, survey data and data from online sources are often restricted to single social groups or features, such as age groups, single schools, companies, or interest groups. Hence, a modeling approach is required that extrapolates the locally restricted data to a global network model. We tackle this Missing Data Problem using Link-Prediction techniques from social network research, network generation techniques from the area of Social Simulation, as well as a combination of both. We found that techniques employing less information may be more adequate to solve this problem, especially when data granularity is an issue. We validated the network models created with our techniques on a number of real-world networks, investigating degree distributions as well as the likelihood of links given the geographical distance between two nodes.

Highlights

  • Social networks shape our lives [1]

  • In this paper, we investigate how well techniques for network generation [9] and link prediction [10], as well as a combination of both, fill the informational gap that frequently occurs between isolated components in social network surveys

  • Missing information may be classified as missing completely at random (MCAR) if the missing value depends neither on other missing values nor on observable values; missing at random (MAR) if the missing value does not depend on other missing values; and missing not at random (MNAR) if the reason for the missing information can be found in the information itself [11]
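
The three missingness classes can be illustrated on a toy edge list. The following is a minimal sketch, not the method used in the paper: the function `observe_edges`, the parameter `p`, the group labels, and the example edges are all hypothetical and chosen only for illustration. It assumes that each node belongs to one survey group and that, in the MNAR case, an edge goes unobserved exactly when it connects two different survey groups, which mirrors the gap between isolated components described above.

```python
import random


def observe_edges(edges, group, mechanism, p=0.3, seed=0):
    """Return the edges that remain observed under a missingness mechanism.

    edges: list of (u, v) pairs; group: dict mapping node -> survey group.
    All names and probabilities here are illustrative assumptions.
    """
    rng = random.Random(seed)
    observed = []
    for u, v in edges:
        if mechanism == "MCAR":
            # Every edge is dropped with the same probability p.
            missing = rng.random() < p
        elif mechanism == "MAR":
            # The drop probability depends only on an observed attribute
            # (the survey group of node u), not on the edge itself.
            missing = rng.random() < (p if group[u] == "A" else p / 3)
        elif mechanism == "MNAR":
            # The edge is missing because of what it is: a link between
            # two different survey groups is never recorded.
            missing = group[u] != group[v]
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        if not missing:
            observed.append((u, v))
    return observed


# Hypothetical toy data: two survey groups with two cross-group links.
group = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (2, 3), (1, 4)]

# Under MNAR only the within-group edges survive.
mnar_observed = observe_edges(edges, group, "MNAR")
# mnar_observed == [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5)]
```

In this sketch the MNAR case leaves two disconnected components, so no edge between the groups is available as a direct training example for imputing the missing links.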


Summary

Introduction

Social networks play a major role in the diffusion of ideas, norms, information, behaviors, or viruses [1,2,3,4]. This motivates a wide range of research on social networks, frequently triggered by the emergence of online social networks and large data sets containing relational data. When it comes to close personal relations such as affective contacts or close friendships, and especially when detailed behavioral data is to be collected, surveys of individuals are still required. To the best of our knowledge, it has not yet been studied how systematically missing data between isolated components in social network surveys (MNAR) may be inferred (or imputed) to enable simulations on a global network model. Traditional solutions to the missing data problems “Fixed Choice Effect” and “Boundary Specification Problem” are applied in survey planning and deal with the careful definition of the survey group [5, 6, 12].

