Abstract

Personality provides deep insight into a person and plays an important part in that person's job performance. Predicting personality from social media has been studied in several works, but the open problem is how to improve the performance of a personality prediction system. The purpose of this research is to predict the personality of Twitter users and to increase the performance of the prediction system. An online survey using the Big Five Inventory (BFI) questionnaire was distributed and gathered 295 Twitter users with 511,617 tweets. We experiment with two methods: a Support Vector Machine (SVM) alone, and a combination of SVM and BERT as the semantic approach. We also implement Linguistic Inquiry and Word Count (LIWC) as the linguistic feature for the personality prediction system. The results show that the combination of SVM and BERT achieves an accuracy of 79.35%, and that adding LIWC improves the accuracy to 80.07%. Overall, these results indicate that combining SVM and BERT as the semantic approach with LIWC features is recommended for better performance of a personality prediction system.
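As an illustration of how semantic and linguistic features might be combined before an SVM, the following is a minimal sketch and not the authors' implementation. It assumes scikit-learn and NumPy are available, and it replaces real BERT embeddings and LIWC scores with random arrays. The dimensions `bert_dim` and `liwc_dim`, the helper `combine_features`, and the binary label are all hypothetical; only the user count of 295 comes from the abstract.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

n_users = 295   # number of Twitter users reported in the abstract
bert_dim = 768  # assumption: one BERT-base-sized vector per user
liwc_dim = 64   # assumption: number of LIWC category scores per user

# Synthetic stand-ins: real features would come from a BERT encoder and LIWC.
bert_features = rng.normal(size=(n_users, bert_dim))
liwc_features = rng.random(size=(n_users, liwc_dim))
# Hypothetical binary label (e.g. high/low on a single trait).
labels = rng.integers(0, 2, size=n_users)


def combine_features(semantic, linguistic):
    """Concatenate per-user semantic and linguistic features column-wise."""
    return np.hstack([semantic, linguistic])


X = combine_features(bert_features, liwc_features)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0, stratify=labels
)

# Standardize features so both blocks are on a comparable scale, then fit an SVM.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"accuracy on synthetic data: {accuracy:.4f}")
```

Because the inputs are random, the printed accuracy says nothing about the reported 79.35% or 80.07%; the sketch only shows the shape of the pipeline, with feature concatenation followed by scaling and an SVM classifier.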

Highlights

  • Personality provides deep insight into a person and plays an important part in that person's job performance

  • A similar study [7] implemented the Decision Tree C4.5 method with TF-RF and TF-CHI2 as the linguistic approach for personality prediction and achieved 65.72% accuracy on 145 Twitter users with a total of 331,439 tweets

  • The results of that research showed an optimal accuracy of 61.6%, achieved with the CNN method using Linguistic Inquiry and Word Count (LIWC) as the linguistic approach


Summary

Introduction

Personality provides deep insight into a person and plays an important part in that person's job performance. Several studies have been carried out to identify social media users' personalities based on the texts or tweets in their accounts. A similar study [7] implemented the Decision Tree C4.5 method with TF-RF and TF-CHI2 as the linguistic approach for personality prediction and achieved 65.72% accuracy on 145 Twitter users with a total of 331,439 tweets. The authors of that research stated that the low accuracy is due to the imbalance of the data: the model tends to predict only the dominant class. Another experiment compared different methods, namely SVM, BLR, MNB, and CNN, on 250 users with 9,900 text data [8]. The results of that research showed an optimal accuracy of 61.6%, achieved with the CNN method using Linguistic Inquiry and Word Count (LIWC) as the linguistic approach. Their optimal performance result was [...] because they extracted LIWC features into CNN. Research usually implements Linguistic Inquiry Word [...] The dataset size in this research is larger than previous [...]

