MD-MBPLS: A novel explanatory model in computational social science

Shan Lu, Jichang Zhao, Huiwen Wang

doi:10.1016/j.knosys.2021.107023

Abstract

The newly emerged field of computational social science has seen unprecedented opportunities over the past decade as data collection has exploded. However, these data are of different types, usually consisting of constant attributes (scalar data), temporal dynamics at the individual level (functional data) and group summaries at the collective level (compositional data). Consequently, existing models struggle to aggregate such data. Moreover, previous studies focus excessively on the predictive power of models rather than their interpretative power, although the latter is more crucial for developing theories and uncovering mechanisms in social science. This paper proposes a novel model, mixed data multi-block partial least squares (MD-MBPLS), to overcome these difficulties. Four real-world datasets are used to evaluate the model. These datasets arise from large-scale collections of smart-card sensors, high-frequency trading systems and online social media, and represent the trending topics of student campus behaviour, stock market volatility, fake news circulation and personality in tweeting. The empirical results consistently show the model's advantages in data fusion and explanatory power. Specifically, the model reveals patterns that offer new insights and can help decision makers derive guidelines for interventions, such as enhancing student academic performance or taking precautions against fake news propagation. Our model can enhance the interpretative ability of computational social science studies by fully leveraging mixed types of data.

Full Text