Machine Learning-based Depression Prediction using Social Media Feeds

M Keerthiga, D Abisha, S Shenbagalakshmi, P Kalaiselvi

doi:10.1109/icict57646.2023.10134427

Abstract

Young people today frequently use social media platforms to express their emotions, and what they post can help us understand how they feel at the time. In response to the critical need for early detection tools, this research study applies sentiment analysis techniques to users' social network posts to help detect potential depression at an early stage. The study describes several methods for predicting sadness from user posts. The dataset is vectorized using a count vectorizer and a TF-IDF vectorizer, and features such as post sentiment are extracted. The dataset is split into training and test sets, and models are trained using the Naive Bayes, Support Vector Machine, Decision Tree, Random Forest, and K-Nearest Neighbors machine learning techniques. The models are evaluated on accuracy and recall. The Instagram API is used to mine Instagram posts to create the dataset for the model. Each comment is preprocessed, and each word is looked up in a lexicon to determine whether it is positive or negative. This research study presents a new feature vector for classifying the texts as positive or negative. Each comment receives a score from the lexicon that signifies its degree of positivity, negativity, and other factors. A CSV file containing around 6,300 posts has been preprocessed. Special and extraneous characters are removed using regular expressions. Data quality is then improved through stop-word removal, lemmatization, and tokenization. The best-performing configuration, a decision tree model with a count vectorizer, achieves an accuracy of 90.19% and a recall of 89.85%.
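The preprocessing, lexicon scoring, count vectorization, and evaluation steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the stop-word list, the lexicon entries, and all function names are hypothetical, lemmatization is omitted for brevity, and no classifier is trained here.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list and sentiment lexicon (illustration only;
# the lexicon and stop-word source used in the study are not specified here).
STOP_WORDS = {"a", "an", "the", "i", "am", "is", "so", "and", "to", "of", "my"}
LEXICON = {"happy": 1, "great": 1, "love": 1, "sad": -1, "hopeless": -1, "tired": -1}


def preprocess(text):
    """Lowercase, replace non-letter characters via a regex, tokenize, drop stop words."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def lexicon_score(tokens):
    """Sum per-word polarity; words missing from the lexicon count as neutral (0)."""
    return sum(LEXICON.get(tok, 0) for tok in tokens)


def count_vector(tokens, vocabulary):
    """Bag-of-words counts in a fixed vocabulary order (what a count vectorizer produces)."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def accuracy_and_recall(y_true, y_pred):
    """Accuracy = correct / total; recall = TP / (TP + FN) for the positive class 1."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = correct / len(y_true)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


if __name__ == "__main__":
    tokens = preprocess("I am SO tired & hopeless!!! #sad")
    print(tokens)
    print(lexicon_score(tokens))
    print(count_vector(tokens, ["sad", "happy", "tired"]))
    print(accuracy_and_recall([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a fuller pipeline, the count vectors (optionally extended with the lexicon score) would be the features passed to a classifier such as a decision tree, and `accuracy_and_recall` would be applied to its predictions on the held-out test set.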

Full Text