Abstract

Female Daily Network is a social media company. Its platform, Female Daily, lets users share their experiences with beauty products. The platform's rules prohibit using it to promote or sell products and services. However, users sometimes violate these rules in their posts, which annoys other users. Admins at Female Daily have difficulty identifying the users who violate these rules and banning posts that contain product sales, because the number of admins is small relative to the number of posts submitted each day. Text mining can address this problem by classifying posts automatically with a system that learns from the words in the available posts. The algorithms used for text mining in this research are Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT), and Random Forest (RF). This study combines feature extraction, contextual features, and data balancing, and uses research scenarios to analyze each of them. Judged by recall, the best combination of algorithm and features in this research is Random Forest with TF-IDF unigram features and additional contextual features that detect money and selling words, with balanced data. This combination achieves a recall of 88.37%.
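As a rough illustration of the contextual features and the recall metric mentioned above, the following minimal sketch flags whether a post mentions money or contains selling words, and computes recall for the positive (sales post) class. The abstract does not give the actual word lists or patterns, so `MONEY_RE`, `SELLING_WORDS`, and both function names are hypothetical stand-ins, not the authors' implementation. In a full pipeline these two flags would presumably be appended to the TF-IDF unigram vector before classification.

```python
import re

# Hypothetical patterns: the abstract does not list the real money or
# selling vocabulary, so these are illustrative assumptions only.
MONEY_RE = re.compile(
    r"\b(?:rp|idr)\.?\s?\d[\d.,]*|\b\d+\s?(?:k|rb|ribu|juta)\b",
    re.IGNORECASE,
)
SELLING_WORDS = {"jual", "sell", "selling", "order", "harga", "price", "promo"}


def contextual_features(post):
    """Return [has_money, has_selling_word] as 0/1 flags for one post."""
    tokens = re.findall(r"[a-z]+", post.lower())
    has_money = int(bool(MONEY_RE.search(post)))
    has_selling = int(any(token in SELLING_WORDS for token in tokens))
    return [has_money, has_selling]


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

Recall is a natural headline metric here because a missed sales post (a false negative) stays visible to other users, whereas a false alarm can still be reviewed by an admin.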
