Abstract

Analyzing publicly available data on human and drug trafficking crimes is hampered by the lack of a comprehensive dataset containing a sufficiently large number of incidents. Our proposed methodology addresses this challenge by using entity resolution techniques to merge multiple state-wide crime datasets with a county-wide incident report dataset, giving a clearer picture of a category of criminal activity in a geographical area. The methodology combines incident reports, crime reports, and court records to close gaps that may be present in any single data source. We apply it to create a dataset of drug and human trafficking related crimes and incidents drawn from three distinct sources (Louisville Open Data Crime Reports, Federal Bureau of Investigation Kentucky Crime Incidents, and the Kentucky Online Offender Lookup website), providing researchers with data to study the link between drug and human trafficking related crimes. In a case study performed on the merged dataset, an XGBoost classifier labeled each 7-day sliding time window within a given county as containing or not containing a human trafficking related incident, achieving a Matthews correlation coefficient of 0.86.
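
To make the case-study setup concrete, the following is a minimal sketch of how per-county 7-day sliding windows might be labeled and how the Matthews correlation coefficient is computed from binary predictions. It is an illustration under stated assumptions, not the authors' pipeline: the function names `label_windows` and `matthews_corrcoef`, the `(county, date, is_human_trafficking)` record layout, and the one-day stride between windows are all hypothetical, and the XGBoost classifier itself is omitted.

```python
from collections import defaultdict
from datetime import date, timedelta
from math import sqrt


def label_windows(incidents, start, end, window_days=7):
    """Label every sliding window per county (assumed stride: 1 day).

    incidents: iterable of (county, day, is_human_trafficking) tuples
               (hypothetical record layout).
    Returns {(county, window_start): 1 or 0}, where 1 means the window
    contains at least one human trafficking related incident.
    """
    ht_days = defaultdict(set)
    counties = set()
    for county, day, is_ht in incidents:
        counties.add(county)
        if is_ht:
            ht_days[county].add(day)

    labels = {}
    for county in sorted(counties):
        window_start = start
        # Only keep windows that fit entirely inside [start, end].
        while window_start + timedelta(days=window_days - 1) <= end:
            window = {window_start + timedelta(days=i) for i in range(window_days)}
            labels[(county, window_start)] = int(bool(window & ht_days[county]))
            window_start += timedelta(days=1)
    return labels


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

The labels produced by `label_windows` would serve as the target that a classifier is trained to predict from window-level features; `matthews_corrcoef` then scores the predictions on a scale from -1 to 1, which stays informative when positive windows are much rarer than negative ones.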
