Abstract

The Defense POW/MIA Accounting Agency (DPAA) works to locate, recover, and identify more than 81,000 US service members missing from past conflicts. Fulfilling this mission requires integrating massive amounts of information from historical records, genealogy records, anthropological data, archaeological data, odontology data, and DNA. Previously, a machine learning record-linkage application was developed to integrate DNA Family Reference Samples (FRS) data systems with the DPAA’s master data. This application was shown to link large record systems with a high level of accuracy and precision. Here, this work is extended to further optimize both the blocking strategy used during record linkage and the record-match alpha-level threshold for the Bayesian classifier. Optimizing the blocking strategy improved application run-time per record by 20%. After record-match alpha-level optimization, the application linked 89.6% of the record out-group to DPAA master data at an accuracy of 99.6%. The improved run-time efficiency and match rate of the record-linkage pipeline will greatly benefit not only the DPAA’s FRS import process but also the linking of other big data sources supporting the DPAA mission.
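
To make the two tuned components concrete, the following is a minimal sketch of blocking followed by a thresholded naive Bayes match decision. It is not the DPAA application: the field names, the blocking key (surname initial plus birth year), the per-field agreement probabilities, the prior, and the 0.95 threshold are all hypothetical values chosen for illustration, and the threshold is applied to the posterior match probability as an assumption about how a match cutoff might be used.

```python
from collections import defaultdict

# Hypothetical per-field probabilities: (P(agree | match), P(agree | non-match)).
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "state": (0.85, 0.10),
}
PRIOR_MATCH = 0.1  # assumed prior that a candidate pair within a block is a match


def block_key(record):
    """Hypothetical blocking key: surname initial plus birth year."""
    return (record["surname"][:1].upper(), record["birth_year"])


def match_probability(a, b):
    """Naive Bayes posterior that records a and b refer to the same person."""
    p_match, p_non = PRIOR_MATCH, 1.0 - PRIOR_MATCH
    for field, (m, u) in FIELD_PROBS.items():
        agree = a[field].strip().lower() == b[field].strip().lower()
        p_match *= m if agree else 1.0 - m
        p_non *= u if agree else 1.0 - u
    return p_match / (p_match + p_non)


def link(incoming, master, threshold=0.95):
    """Link each incoming record to its best master candidate in the same block."""
    index = defaultdict(list)
    for rec in master:
        index[block_key(rec)].append(rec)

    links = {}
    for rec in incoming:
        # Blocking: only master records sharing the key are compared.
        scored = [(match_probability(rec, cand), cand["id"])
                  for cand in index[block_key(rec)]]
        if scored:
            prob, best_id = max(scored)
            if prob >= threshold:
                links[rec["id"]] = best_id
    return links


master = [
    {"id": "M1", "surname": "Smith", "first_name": "John", "birth_year": 1920, "state": "OH"},
    {"id": "M2", "surname": "Stone", "first_name": "Jack", "birth_year": 1920, "state": "TX"},
    {"id": "M3", "surname": "Smith", "first_name": "John", "birth_year": 1925, "state": "OH"},
]
incoming = [
    {"id": "F1", "surname": "smith", "first_name": "John", "birth_year": 1920, "state": "OH"},
    {"id": "F2", "surname": "Young", "first_name": "Mary", "birth_year": 1931, "state": "NY"},
]
links = link(incoming, master)
```

In this sketch, a coarser blocking key yields more candidate comparisons per record (slower, but fewer missed matches), while a stricter threshold trades match rate for accuracy; these are the two kinds of trade-off the optimization described above addresses.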

