The rapid development of communication technology has been accompanied by a rise in unwanted messages such as spam and phishing attempts. User awareness of basic safe technology use has not kept pace, and the enforcement of laws on internet-based crime remains unclear, which together increase the risk that internet users fall victim to such crimes. WhatsApp is one of the channels prone to spam and phishing, so this research focuses on it and aims to develop an application that filters spam and phishing messages. The application represents message text with TF-IDF (Term Frequency-Inverse Document Frequency) features and classifies it with a Random Forest model. It is built on the MVVM (Model-View-ViewModel) architecture, which separates business logic from the user interface and thereby makes development and maintenance more efficient. The results show that the combination of TF-IDF and Random Forest classifies spam and phishing messages with high accuracy: evaluation with a confusion matrix gives an overall accuracy of 92%. For the safe message class, precision, recall, and F1 score are 89%, 95%, and 92%, respectively; for the dangerous message class, they are 95%, 88%, and 92%. The integration of the model into the application was verified with black-box testing: all test scenarios passed, and test messages were detected with 98% accuracy. The developed application therefore provides enhanced protection for WhatsApp users against digital threats.
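
The per-class precision, recall, and F1 scores reported above are all derived from a confusion matrix. The following is a minimal Python sketch of that calculation for a two-class (safe vs. dangerous) problem. The function names, the class ordering, and the counts in `toy_matrix` are assumptions made up for illustration; they are not the study's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassMetrics:
    precision: float
    recall: float
    f1: float


def class_metrics(tp: int, fp: int, fn: int) -> ClassMetrics:
    """Precision, recall, and F1 for one class from its raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return ClassMetrics(precision, recall, f1)


def evaluate(matrix):
    """Summarise a 2x2 confusion matrix.

    matrix[i][j] is the number of messages whose true class is i and whose
    predicted class is j, with class 0 = safe and class 1 = dangerous
    (an ordering assumed for this sketch).
    """
    (safe_as_safe, safe_as_danger), (danger_as_safe, danger_as_danger) = matrix
    total = safe_as_safe + safe_as_danger + danger_as_safe + danger_as_danger
    accuracy = (safe_as_safe + danger_as_danger) / total if total else 0.0
    return {
        "accuracy": accuracy,
        # For the safe class, a dangerous message predicted safe is a false positive.
        "safe": class_metrics(tp=safe_as_safe, fp=danger_as_safe, fn=safe_as_danger),
        # For the dangerous class, the roles of the two off-diagonal cells swap.
        "dangerous": class_metrics(tp=danger_as_danger, fp=safe_as_danger, fn=danger_as_safe),
    }


# Hypothetical counts, chosen only to exercise the functions.
toy_matrix = [[9, 1],
              [2, 8]]
report = evaluate(toy_matrix)
print(f"accuracy: {report['accuracy']:.2f}")
for name in ("safe", "dangerous"):
    m = report[name]
    print(f"{name}: precision={m.precision:.2f} recall={m.recall:.2f} f1={m.f1:.2f}")
```

Because the two off-diagonal cells swap roles between the classes, high recall on one class pairs with high precision on the other, which matches the pattern in the figures reported above.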