Abstract

This work aims to establish a logistic model that classifies whether or not an online conversation is a grooming conversation. The work is important for several reasons: the growing number of Internet users across the globe, the growing number of social media platforms, the increase in the number and types of crime on the Internet, and the fact that child sexual abuse harms its victims both physically and psychologically. Online grooming was the most reported suspected Internet activity in 2009–2010 according to Child Exploitation and Online Protection, which is part of the UK Home Office's Serious Organised Crime Agency. Around 160 online conversation scripts are analyzed to determine the characteristics of a grooming conversation. The scripts are obtained randomly from http://www.perverted-justice.com and www.literotika.com. The characteristics are divided into 20 types. The scripts are divided into two sets: 100 scripts for the training set and 59 scripts for the testing set. As a result, the five most relevant grooming characteristics are identified using a paired t-test, and a logistic model is established on this basis. The model is evaluated on the testing data set, and the results show relatively good performance: 95% accuracy, 96% true positive, 4% false positive, 93% true negative, and 7% false negative.
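The classification and evaluation steps summarized above can be illustrated with a minimal sketch. It assumes each conversation is reduced to a vector of five numeric characteristic scores and that a fitted logistic model supplies one weight per characteristic plus an intercept. The function names, the 0.5 decision threshold, and the example weights are hypothetical placeholders for illustration, not the coefficients or procedure reported in the paper.

```python
import math

def grooming_probability(features, weights, bias):
    """Logistic model: P(grooming) = 1 / (1 + exp(-(bias + sum(w_i * x_i))))."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def classify(features, weights, bias, threshold=0.5):
    """Label a conversation as grooming when the probability reaches the threshold.

    The 0.5 threshold is an assumption; the abstract does not state one.
    """
    return grooming_probability(features, weights, bias) >= threshold

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for boolean true labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    """Fraction of conversations whose predicted label matches the true label."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)

if __name__ == "__main__":
    # Hypothetical weights for five characteristics; NOT values from the paper.
    weights = [1.0, 1.0, 1.0, 1.0, 1.0]
    bias = -2.0
    conversation = [1, 1, 0, 1, 0]  # made-up characteristic scores
    print(grooming_probability(conversation, weights, bias))
    print(classify(conversation, weights, bias))
```

In practice the weights and intercept would be estimated from the training set, and the counts returned by `confusion_counts` would be computed on the held-out testing set to obtain rates like those quoted above.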
