Abstract
This paper describes a procedure used to link Medicaid claims data to California vital statistics records for very low birthweight infants. The linkage involved about 53,000 infants born from 1980 to 1987 and 1.46 million claims for delivery/birth-related hospital admissions during the same period. Because the two data files did not share a unique identifier, record linkage required combining evidence across several linking variables: delivery hospital, delivery/birth date or hospitalization period, names, mother's age, and zip code. To combine the various pieces of evidence, we used record linkage theory to compute scores that measure the likelihood of a match, i.e., that two records correspond to the same delivery. These scores appropriately weight the various pieces of evidence for or against a match. Implementation required dealing with large amounts of missing data in one of the files, errors and variations in reported names, and the need to minimize the number of incorrect links. The approach applies to a wide range of linkage problems. The ability to combine existing datasets to form new datasets containing analysis variables from each facilitates analyses that would otherwise be impossible, or prohibitively expensive.
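The scoring idea described above can be sketched in the style of standard probabilistic record linkage (Fellegi-Sunter weights). This is a minimal illustration under stated assumptions, not the procedure the paper implements: the field names, the m and u probabilities, and the example records are all hypothetical, and names are compared by exact equality even though the abstract notes that names contain errors and variations. Each field contributes the log of a likelihood ratio, positive when the values agree and negative when they disagree, and a field that is missing in either record contributes nothing.

```python
import math

# Hypothetical probabilities, for illustration only (not taken from the paper):
#   m = P(field agrees | the two records are the same delivery)
#   u = P(field agrees | the two records are different deliveries)
FIELD_PROBS = {
    "hospital":   (0.95, 0.01),
    "birth_date": (0.90, 0.003),
    "surname":    (0.85, 0.002),
    "mother_age": (0.80, 0.05),
    "zip_code":   (0.75, 0.01),
}


def field_weight(field, a, b):
    """Log2 likelihood-ratio weight for one field.

    A value missing from either record gives a weight of 0, so it counts
    neither for nor against a match (an assumption of this sketch).
    """
    if a is None or b is None:
        return 0.0
    m, u = FIELD_PROBS[field]
    if a == b:
        return math.log2(m / u)            # agreement: evidence for a match
    return math.log2((1 - m) / (1 - u))    # disagreement: evidence against


def match_score(rec_a, rec_b):
    """Sum the field weights; higher scores indicate a more likely match."""
    return sum(
        field_weight(f, rec_a.get(f), rec_b.get(f)) for f in FIELD_PROBS
    )


# Hypothetical records: the claim lacks a zip code and disagrees on age.
claim = {"hospital": "H012", "birth_date": "1984-03-07",
         "surname": "GARCIA", "mother_age": 24, "zip_code": None}
birth = {"hospital": "H012", "birth_date": "1984-03-07",
         "surname": "GARCIA", "mother_age": 25, "zip_code": "90011"}

print(round(match_score(claim, birth), 2))
```

In practice, candidate pairs would be ranked by this score and only those above a chosen threshold accepted as links; setting the threshold high is one way to limit incorrect links, at the cost of missing some true matches.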