Abstract

We are what we do, like, and say. Numerous research efforts have addressed the automatic assessment of personality dimensions, relying on information gathered from social media platforms such as an individual's list of friends, interests in music and movies, and the endorsements and likes the individual has given. Turning this information into signals and feeding them to supervised learning approaches has proved particularly effective and accurate in computing personality traits and types. Despite the demonstrated accuracy of these approaches, the sheer amount of information they require, together with access restrictions, makes them unfeasible in a real usage scenario. In this paper, we propose a supervised learning approach to compute personality traits by relying only on what an individual tweets about publicly. The approach segments tweets into tokens, then learns word vector representations as embeddings, which are used to feed a supervised learner. We demonstrate the effectiveness of the approach by measuring the mean squared error of the learned model on an international benchmark of Facebook status updates. We also test the transfer learning predictive power of this model on an in-house benchmark created by twenty-four panelists who completed a state-of-the-art psychological survey, and we observe a good conversion of the model when comparing the analysis of their Twitter posts against the personality traits extracted from the survey.
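As a rough illustration of the first two stages described above (segmenting text into tokens and mapping it to word-vector features), the following minimal Python sketch turns a tweet into a fixed-length feature vector. The `TOY_EMBEDDINGS` table, the regex tokenizer, and the averaging of word vectors are assumptions made for illustration only; the abstract does not specify the tokenizer, the embedding model, or how word vectors are combined before they reach the learner.

```python
import re

# Hypothetical 3-dimensional embeddings (assumption); a real system would
# load pretrained or learned word vectors instead of this toy table.
TOY_EMBEDDINGS = {
    "love": [0.9, 0.1, 0.0],
    "music": [0.2, 0.8, 0.1],
    "rain": [0.0, 0.3, 0.7],
}
DIM = 3


def tokenize(tweet):
    """Lowercase the tweet and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", tweet.lower())


def tweet_vector(tweet, embeddings=TOY_EMBEDDINGS, dim=DIM):
    """Average the vectors of known tokens; return a zero vector if none is known."""
    vectors = [embeddings[t] for t in tokenize(tweet) if t in embeddings]
    if not vectors:
        return [0.0] * dim
    # zip(*vectors) groups the values of each dimension across all tokens.
    return [sum(component) / len(vectors) for component in zip(*vectors)]


features = tweet_vector("I LOVE music!")
print(features)
```

A vector like `features` would then serve as the input row for a supervised learner, with one personality trait score as the target.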

Highlights

  • Nowadays, social media platforms are the largest mines of personal information, since they continuously record people’s habits, interactions, and interests in music, movies, and shopping. Such information regarding individuals is so comprehensive that it has become essential for many applications targeting the final customer; industries such as goods retailers and advertising agencies are researching and implementing strategies to acquire this data to better profile their customers and to customize their offers of products and services

  • To derive the best-performing predictive model, we explore different machine learning algorithms and configurations, and we evaluate them on the training set by minimizing the mean squared error, which is used as the loss function

  • We compare the learning model used in our approach, namely support vector machine (SVM), with two baseline algorithms, namely Linear Regression [90] and Least Absolute Shrinkage and Selection Operator (LASSO) [91], both used in state-of-the-art approaches for personality prediction [1], and both trained with the same configuration setup as our approach
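The selection criterion named in the highlights, mean squared error, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `best_model` helper is hypothetical, the candidate names are neutral stand-ins for learners such as SVM, Linear Regression, and LASSO, and all scores are made-up values, not results from the paper.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between gold and predicted trait scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def best_model(y_true, predictions):
    """Return (name, mse) of the candidate with the lowest MSE (hypothetical helper)."""
    scores = {
        name: mean_squared_error(y_true, y_pred)
        for name, y_pred in predictions.items()
    }
    name = min(scores, key=scores.get)
    return name, scores[name]


# Made-up gold scores and predictions for a single trait, for illustration only.
gold = [3.0, 4.0, 2.0, 5.0]
candidates = {
    "candidate_a": [3.0, 4.0, 3.0, 5.0],
    "candidate_b": [2.0, 4.0, 3.0, 5.0],
    "candidate_c": [3.0, 3.0, 3.0, 4.0],
}
name, mse = best_model(gold, candidates)
print(name, mse)
```

In practice each candidate's predictions would come from a model trained with the same configuration, so that the MSE values are directly comparable.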


Summary

Introduction

Social media platforms are the largest mines of personal information, since they continuously record people’s habits, interactions, and interests in music, movies, and shopping. Such information regarding individuals is so comprehensive that it has become essential for many applications targeting the final customer; industries such as (digital) goods retailers and advertising agencies are researching and implementing strategies to acquire this data to better profile their customers and to customize their offers of products and services. [Information 2018, 9, 127] [...] a wide range of personal attributes that people would typically assume to be private and profile the individual along five dimensions, namely Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism. The acquisition of such information requires explicit consent from the individuals, which adds extreme complexity and makes it unfeasible to apply in real usage scenarios. This requirement exists to prevent leaks of sensitive information that could harm or discriminate against individuals.
