There has been extensive recent discussion of the difficulty of estimating meaningful error rates in forensic firearms examination and other areas of pattern evidence. The 2016 President’s Council of Advisors on Science and Technology (PCAST) report criticized many forensic disciplines for lacking the kinds of studies that provide the error rate measurements seen in other scientific fields. However, there is no consensus on how to measure an “error rate” in fields such as forensic firearm examination, whose conclusion scales include an “inconclusive” category, as in the Association of Firearm and Tool Mark Examiners (AFTE) Range of Conclusions. Many authors appear to assume that the error rate calculated under the binary decision model is the only appropriate way to report errors, but there have been attempts to adapt that error rate to scientific fields in which the inconclusive category is viewed as a meaningful outcome of the examination process. In this study we present three neural networks of differing complexity and performance, trained to classify the outlines of ejector marks on cartridge cases fired from different firearm models, as a model system for examining how various error metrics perform in systems that use the inconclusive category. We also discuss an entropy-based (information-based) method for assessing the similarity of classifications to ground truth that is applicable to range-of-conclusions scales, even when the inconclusive category is used.
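To make the measurement problem concrete, here is a minimal sketch of how a single set of calls yields different "error rates" depending on how inconclusives are treated, alongside one possible entropy-based score. It is not the method used in the study: the function names, the three conventions, the normalization of mutual information by the ground-truth entropy, and the toy data are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def error_rates(truth, calls, inconclusive="inconclusive"):
    """Error rate under three hypothetical conventions for inconclusive calls."""
    n = len(truth)
    n_inc = sum(1 for c in calls if c == inconclusive)
    # A call is wrong only if it is conclusive and disagrees with ground truth.
    n_wrong = sum(1 for t, c in zip(truth, calls) if c != inconclusive and c != t)
    conclusive = n - n_inc
    return {
        "inconclusives_excluded": n_wrong / conclusive if conclusive else float("nan"),
        "inconclusives_as_correct": n_wrong / n,
        "inconclusives_as_errors": (n_wrong + n_inc) / n,
    }

def entropy(counts):
    """Shannon entropy in bits of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)

def normalized_mutual_information(truth, calls):
    """Assumed score: I(truth; calls) / H(truth). 1 means the calls fully determine the truth."""
    h_truth = entropy(Counter(truth))
    h_calls = entropy(Counter(calls))
    h_joint = entropy(Counter(zip(truth, calls)))
    mutual_information = h_truth + h_calls - h_joint
    return mutual_information / h_truth if h_truth > 0 else float("nan")

# Toy data (hypothetical): ten comparisons, two inconclusive, two wrong.
truth = ["same"] * 5 + ["different"] * 5
calls = ["same", "same", "same", "inconclusive", "different",
         "different", "different", "different", "inconclusive", "same"]

rates = error_rates(truth, calls)
score = normalized_mutual_information(truth, calls)
print(rates)
print(round(score, 3))
```

On this toy data the three conventions give 0.25, 0.2, and 0.4 for the same set of calls. The information-based score avoids choosing among them: it treats "inconclusive" as one more outcome and asks how much the calls reduce uncertainty about ground truth.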