Abstract

Scalable link prediction in social networks enables dynamic social interaction gathering, potential friend suggestions, and community detection. Distributed open-source frameworks such as Hadoop and Spark facilitate efficient link prediction, especially in large-scale social networks. These frameworks expose many tunable properties that users configure manually for each application. However, manual configuration is hard to set up, prone to human error, and can cause performance problems as applications scale. This paper proposes a novel Self-Configured Framework (SCF) that adds an autonomous feature to Spark: it uses an XGBoost classifier to predict and set the best configuration immediately before the application executes. With self-configuration enabled, the framework demonstrates a 40% reduction in prediction time as well as balanced resource consumption that makes full use of available resources, especially when clusters are limited in number and size. The framework establishes its efficiency for link prediction in large-scale social networks by automatically selecting the configuration best suited to a specific application, given the workload, the cluster specification, and the varying dataset size of the Twitter social network.
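As a rough illustration of the self-configuration idea, the sketch below maps a workload description to one of several candidate Spark configurations and renders it as `spark-submit` arguments. It is a minimal sketch under stated assumptions, not the SCF implementation: the `Workload` fields, the candidate values, and `predict_config_class` are hypothetical, and the threshold rule merely stands in for the trained XGBoost classifier described above. Only the Spark property names and the `--conf key=value` form are standard.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical candidate configurations, keyed by class label.
# The values are illustrative only and are not taken from the paper.
CANDIDATE_CONFIGS: Dict[int, Dict[str, str]] = {
    0: {"spark.executor.memory": "2g", "spark.executor.cores": "2",
        "spark.default.parallelism": "16"},
    1: {"spark.executor.memory": "4g", "spark.executor.cores": "4",
        "spark.default.parallelism": "64"},
    2: {"spark.executor.memory": "8g", "spark.executor.cores": "4",
        "spark.default.parallelism": "128"},
}


@dataclass
class Workload:
    """Hypothetical feature set describing one application run."""
    dataset_gb: float   # size of the input graph data
    worker_nodes: int   # number of worker nodes in the cluster


def predict_config_class(workload: Workload) -> int:
    """Stand-in for the trained classifier (the paper uses XGBoost).

    A simple threshold on data per node replaces the learned model here,
    purely so that the example runs without external dependencies.
    """
    data_per_node = workload.dataset_gb / workload.worker_nodes
    if data_per_node < 1:
        return 0
    if data_per_node < 8:
        return 1
    return 2


def to_spark_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render a configuration as `--conf key=value` pairs for spark-submit."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


def self_configure(workload: Workload) -> List[str]:
    """Pick a configuration before execution and return the submit arguments."""
    label = predict_config_class(workload)
    return to_spark_submit_args(CANDIDATE_CONFIGS[label])
```

Framing configuration selection as classification over a fixed set of candidates keeps the prediction step cheap enough to run just before each job is submitted; the returned list can be appended to a `spark-submit` command line.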
