Machine Learning Framework for Women Safety Prediction using Decision Tree

P Sareena Sowmika, Shaik Rafi, S. Siva Nageswara Rao

doi:10.1109/icssit55814.2023.10060997

Abstract

Harassment and violence are major problems for women in every city. Women’s personal lives are further affected by the bullying and abusive content that circulates on online social networks (OSNs). It is therefore necessary to assess women’s safety in the OSN environment. Traditional methods, however, fall short of the best achievable predictive performance in safety analysis. This study therefore proposes a women safety prediction method based on a decision tree classifier (WSP-DT). A Twitter dataset is first pre-processed to remove blank and unknown entries. The tweets are then processed with the Natural Language Toolkit (NLTK), which handles tokenization, case conversion, stop-word detection, stemming, and lemmatization. Next, a TextBlob-based procedure determines whether each pre-processed tweet has positive, negative, or neutral polarity. Term frequency-inverse document frequency (TF-IDF) is then used to extract features based on word and character frequency. Finally, a decision tree classifier, trained over several rounds, determines whether a tweet is fake or genuine. Simulations on the Twitter dataset show that the proposed WSP-DT classifier outperforms the compared methods.
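The pre-processing and TF-IDF steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library instead of NLTK, and the stop-word list, the example tweets, and the helper names `preprocess` and `tf_idf` are assumptions made for illustration. The weighting is the unsmoothed textbook form, tf = count / document length and idf = ln(N / df), which may differ from the variant used in the paper.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; NLTK ships a much larger one).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "in", "on", "and", "of", "at", "i", "my"}


def preprocess(tweet):
    """Lowercase the tweet, tokenize on runs of letters, and drop stop words."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    tf  = term count / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    # Document frequency: count each term once per document.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical example tweets, not taken from the paper's dataset.
tweets = [
    "I felt safe walking home at night",
    "The street is not safe at night",
    "Harassment on the bus again",
]
tokenized = [preprocess(t) for t in tweets]
vectors = tf_idf(tokenized)
```

Terms that occur in several tweets (here "safe" and "night") receive lower weights than terms unique to one tweet (such as "bus"). In a full pipeline, vectors of this kind would be the input features for the decision tree classifier.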

Full Text