Abstract

Purpose: This paper addresses missing data in the Uniform Crime Reports (UCR). We compare several multiple imputation methods in an effort to improve the data.

Methods: We develop a two-stage procedure for imputing missing values in the UCR. First, we use agencies' past reporting to impute missing values. Then, after merging in demographic data from the census, we use multiple imputation to fill in the remaining missing values. Several methods are considered, including multivariate normal imputation (MVNI) and multiple imputation by chained equations (MICE).

Results: Methods are compared using a complete subset of the UCR with values systematically deleted. We find that a model using MICE with random forest prediction produces results comparable to those of the methods the National Archive of Criminal Justice Data (NACJD) uses when preparing the county-level UCR data.

Conclusions: We recommend applying multiple imputation to the UCR rather than relying on the county-level data published by NACJD. Our results are similar to NACJD's, and an approach using multiple imputation also accounts for the uncertainty in imputed values, so that correct standard errors can be obtained in subsequent analyses.
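To illustrate how multiple imputation carries the uncertainty of imputed values into standard errors, the following minimal sketch pools results from several imputed datasets using Rubin's rules, the standard combining rules for multiple imputation. It is not taken from the paper: the function name `pool_rubin` and the numbers are hypothetical, and the sketch assumes that the same model has already been fit to each of the m imputed datasets, yielding one point estimate and one squared standard error per dataset.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m estimates and their squared standard errors with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the matching squared standard errors
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical coefficient estimates from m = 3 imputed datasets.
estimates = [2.0, 2.5, 3.0]
variances = [0.25, 0.25, 0.25]
pooled, se = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, pooled SE = {se:.3f}")
```

The between-imputation term is what a single filled-in dataset cannot supply: when the imputations disagree, the pooled standard error grows beyond the within-imputation value.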
