Abstract
This study aims to detect signs of mental illness in Reddit posts using sentiment analysis and compares the performance of three models: k-Nearest Neighbors (k-NN), Random Forest, and a Neural Network. Using text posts from the pre-pandemic and post-pandemic periods, we find that the Random Forest model has the highest overall performance, with an F1 score, accuracy, recall, and precision of 80.6% each, making it quite effective at detecting depression. Although the Neural Network model has slightly lower accuracy (79%), it has the lowest error rate (0.06496). The k-NN model shows the lowest accuracy and a higher error rate. These findings highlight the potential of sentiment analysis and machine learning for identifying mental health issues on social media and suggest that better models can improve early detection and intervention efforts.
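The four reported metrics can be made concrete with a minimal sketch. It assumes a binary labelling (1 = depression-related post, 0 = other), which the abstract does not state; the function name `binary_metrics` and the toy labels are hypothetical and are not data or code from the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels (assumption for illustration, not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

In this toy case the false positives and false negatives are equal in number, as are the true positives and true negatives, so all four metrics take the same value. This is one way identical figures across the four metrics can arise. Whether that is the reason for the identical 80.6% figures reported here is not stated in the abstract.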