Abstract

Intimate partner violence (IPV) has been widely studied, both to determine the factors that influence its occurrence and to predict it. In Peru, 68.2% of women have been victims of violence, of which 31.7% were victims of physical aggression, 64.2% of psychological aggression, and 6.6% of sexual aggression. To predict psychological, physical and sexual intimate partner violence in Peru, we used the database of complaints registered in 2016 by the “Ministerio de la Mujer y Poblaciones Vulnerables”. This database comprises 70,510 complaints and 236 variables describing the characteristics of the victim and the aggressor. First, we used Chi-squared feature selection to find the most influential variables. Next, we applied SMOTE and random under-sampling to balance the dataset. Then, we evaluated Multinomial Logistic Regression, Random Forest, Naive Bayes and Support Vector Machine classifiers on the balanced dataset using 10-fold cross-validation, to predict the type of partner violence and compare their results. The results indicate that the Multinomial Logistic Regression and Support Vector Machine classifiers performed better across scenarios with different feature subsets, whereas the Naive Bayes classifier performed worse. Finally, we observed that the classifiers' performance improved as the number of features increased.
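
As a rough illustration of two of the steps named above (Chi-squared feature ranking and random under-sampling), the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy feature names (`aggressor_drinks`, `victim_employed`) and the toy records are hypothetical, and SMOTE, the classifiers and the 10-fold cross-validation are not shown.

```python
import random
from collections import Counter


def chi_squared(feature, labels):
    """Pearson chi-squared statistic for the feature-vs-label contingency table."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            # Expected cell count if feature and label were independent.
            expected = f_count * c_count / n
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat


def rank_features(columns, labels):
    """Return (name, score) pairs sorted from most to least associated with the label."""
    scores = {name: chi_squared(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows so every class keeps as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        balanced.extend((row, label) for row in rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: eight complaints, two categorical variables.
toy_labels = ["psychological"] * 4 + ["physical"] * 4
toy_columns = {
    "aggressor_drinks": ["no", "no", "no", "no", "yes", "yes", "yes", "yes"],
    "victim_employed": ["yes", "no", "yes", "no", "yes", "no", "yes", "no"],
}
ranking = rank_features(toy_columns, toy_labels)

# Hypothetical imbalanced sample: six psychological cases, two physical cases.
imbalanced_labels = ["psychological"] * 6 + ["physical"] * 2
balanced = random_under_sample(list(range(8)), imbalanced_labels)
```

In this toy example `aggressor_drinks` separates the two classes perfectly and so ranks first, while `victim_employed` is independent of the label and scores zero. A feature-selection step would keep the top-ranked variables before the dataset is balanced and passed to the classifiers.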
