Abstract
Cost considerations are critical in the analysis and prevention of traffic crashes. Integrating cost data into crash datasets makes it possible to analyze crash costs together with all related crash attributes. The integration is challenging, however, because the databases lack shared unique identifiers and because of privacy and confidentiality regulations. This study compared deterministic and probabilistic record linkage, using attribute-matching techniques with numerical distance and weight patterns under the Fellegi–Sunter approach. The deterministic algorithm, based on an exact match of the 14-digit police accident record number, had an overall matching performance of 52.38% of real matched records, while the probabilistic algorithm had an overall matching performance of 70.41% with a sensitivity of 99.99%. The probabilistic approach thus outperformed the deterministic approach by roughly 18 percentage points of records matched. Probabilistic matching with numerical variables therefore appears to be a good matching strategy, as supported by the quality measurements. Once the records were matched, a multivariable regression model was developed to model medical costs and identify factors that increase the cost of treating injured claimants in Puerto Rico.
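The contrast between the two linkage strategies can be illustrated with a minimal sketch. It is not the study's algorithm: the field names, the m- and u-probabilities, the age tolerance, and the treatment of missing values are all assumptions chosen for illustration. The deterministic rule requires an exact match on the accident record number, while the Fellegi–Sunter score sums log-likelihood-ratio weights over several fields, using a numerical distance for age.

```python
import math

# Hypothetical (m, u) probabilities per field; not taken from the study.
# m = P(field agrees | records truly match)
# u = P(field agrees | records do not match)
M_U = {
    "accident_number": (0.95, 0.0001),
    "accident_date": (0.90, 0.01),
    "age": (0.85, 0.05),
}

AGE_TOLERANCE = 1  # assumed numerical distance allowed for "agreement" on age


def agrees(field, a, b):
    """Return True if two field values agree (age uses a numerical distance)."""
    if field == "age":
        return abs(a - b) <= AGE_TOLERANCE
    return a == b


def deterministic_match(crash, claim):
    """Exact match on the accident record number; fails if either is missing."""
    number = crash.get("accident_number")
    return number is not None and number == claim.get("accident_number")


def fs_weight(crash, claim):
    """Sum of Fellegi-Sunter agreement/disagreement weights (base-2 logs).

    Assumption: a field missing on either side contributes zero weight.
    """
    total = 0.0
    for field, (m, u) in M_U.items():
        a, b = crash.get(field), claim.get(field)
        if a is None or b is None:
            continue
        if agrees(field, a, b):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


crash = {"accident_number": "20150000012345", "accident_date": "2015-03-14", "age": 34}
# Same event, but the claim lacks the accident number and age differs by one year.
claim_same = {"accident_number": None, "accident_date": "2015-03-14", "age": 35}
# Unrelated claim: different date and a very different age.
claim_other = {"accident_number": None, "accident_date": "2016-07-02", "age": 61}

THRESHOLD = 8.0  # hypothetical decision threshold

print(deterministic_match(crash, claim_same))     # exact-ID rule
print(fs_weight(crash, claim_same) > THRESHOLD)   # probabilistic rule
print(fs_weight(crash, claim_other) > THRESHOLD)
```

In this toy case the deterministic rule misses the pair because the identifier is absent, while the summed weights from date and age still clear the assumed threshold. In practice the m- and u-probabilities and the threshold would be estimated from the data rather than fixed by hand.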
More From: Transportation Research Record: Journal of the Transportation Research Board