Abstract

The Myers-Briggs Type Indicator (MBTI) is a popular personality metric that uses four dichotomies as indicators of personality traits. This study aims to classify MBTI personality types from the text of user posts on the social media platform Reddit. It uses a public dataset from Kaggle, the Myers-Briggs Personality Type Dataset, which consists of around 8,000 posts collected from the MBTI subreddit. Several machine learning classification models are tested, with the under- and over-sampling techniques of imblearn (imbalanced-learn) used to handle class imbalance. Text processing methods such as tokenization, punctuation removal, and stemming are applied to the raw data before it is passed to the models. The experimental results show that the LSTM model with the Adam optimizer and a learning rate of 0.01 performs well relative to the other machine learning models, with an accuracy of 80.73%. Beyond the LSTM, XGBoost gives the highest accuracy on the 16-type classification task (60.09%), and Logistic Regression gives the best single-dimension accuracy, 87.21% on the N/S dimension.
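As a rough illustration of the preprocessing steps and the per-dimension labels mentioned above, the following is a minimal sketch and not the authors' implementation. The function names (`naive_stem`, `preprocess`, `split_dichotomies`) are hypothetical, the suffix-stripping stemmer is a toy stand-in for a real stemmer such as Porter's, and the axis ordering I/E, N/S, T/F, J/P is an assumption based on the usual MBTI letter order.

```python
import string
from typing import Dict, List

# All names below are hypothetical; the abstract does not give implementation details.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)
_SUFFIXES = ("ing", "ed", "ly", "es", "s")  # toy stand-in for a real stemmer


def naive_stem(token: str) -> str:
    """Strip at most one common suffix, keeping at least three characters."""
    for suffix in _SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(post: str) -> List[str]:
    """Lowercase, remove punctuation, tokenize on whitespace, then stem."""
    cleaned = post.lower().translate(_PUNCT_TABLE)
    return [naive_stem(tok) for tok in cleaned.split()]


def split_dichotomies(mbti_type: str) -> Dict[str, int]:
    """Map a four-letter type such as 'INTJ' to four binary labels.

    Each label is 1 when the letter is the first of its axis (I, N, T, J)
    and 0 otherwise, so one binary classifier can be trained per axis.
    """
    mbti_type = mbti_type.upper()
    axes = ("IE", "NS", "TF", "JP")
    if len(mbti_type) != len(axes):
        raise ValueError("expected a four-letter MBTI type")
    labels = {}
    for axis, letter in zip(axes, mbti_type):
        if letter not in axis:
            raise ValueError(f"unexpected letter {letter!r} for axis {axis}")
        labels[axis] = int(letter == axis[0])
    return labels
```

For example, `preprocess("Thinking, always thinking!")` yields `["think", "alway", "think"]`, and `split_dichotomies("ENFJ")` yields `{"IE": 0, "NS": 1, "TF": 0, "JP": 1}`. Treating each axis as its own binary task is what makes a per-dimension figure such as the N/S accuracy distinct from accuracy over all 16 types.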
