Abstract

Objectives Identifying nonparticipants in the Japan National Health and Nutrition Survey (NHNS) requires record linkage with its master sample, the Comprehensive Survey of Living Conditions (CSLC). In principle, individual records from the two surveys can be merged on key identifiers such as household ID, but false matches and false nonmatches can occur. We examined combinations of key variables to improve the record linkage used to identify nonparticipants in the NHNS.

Methods We used individual-level data from the NHNS and the CSLC from 1988 to 2015 (except 2012). From the CSLC data, we extracted individuals in the unit blocks that participated in the NHNS and merged their records with the NHNS records. We used four combinations of key variables: (A) prefecture ID, census enumeration district ID, unit block ID, household ID, and household member ID; (B) the keys in A with household member ID replaced by sex and birth year and month (or age); (C) the keys in A plus sex and birth year and month (or age); and (D) a two-stage linkage combining B and C. We classified individuals into matched participants, unmatched NHNS participants, and unmatched CSLC participants (a proxy for nonparticipants). We compared the percentages of matched NHNS participants and unmatched CSLC participants across the four combinations of key variables.

Results We obtained a sample of 455,854 participants from the CSLC and 335,010 from the NHNS. The percentage of matched NHNS participants was highest in A (upper 90% range), followed by D (lower 90% range), B (lower 90% range), and C (80% range). Compared to C, the percentage of matched NHNS participants was higher by 8-14 percentage points in A and by 5-10 percentage points in B. Compared to B, it was higher by 0.1-0.4 percentage points in D. The percentage of unmatched CSLC participants was highest in C, followed by B, D, and A. Under D, the percentage of unmatched CSLC participants rose from the 20% range in the late 1980s to around 30% in the 1990s, and stayed between the 30% range and the lower 40% range in the 2000s.

Conclusion The highest percentage of accurate matches of NHNS participants was obtained by accounting for changes in household member ID, incorrect entries for sex, birth year and month, or age, and same-sex multiple births. However, there are limitations in handling participants who remain unmatched because of changes in household ID or other reasons. The possibility that unmatched CSLC participants include false nonmatches should therefore be considered when treating them as nonparticipants in the NHNS.
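The four key combinations described in Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names, the toy records, and the one-to-one exact-match rule are all assumptions made for illustration, and because the abstract does not state the order of the two stages in D, the sketch assumes linkage on C first and then on B for the records left over.

```python
from collections import defaultdict

# Hypothetical field names; the surveys' actual variable names are not given.
BASE = ("prefecture", "district", "unit_block", "household")
KEY_SETS = {
    "A": BASE + ("member",),                 # household member ID
    "B": BASE + ("sex", "birth"),            # member ID replaced by sex and birth
    "C": BASE + ("member", "sex", "birth"),  # sex and birth added to A
}

def link(cslc, nhns, keys):
    """One-to-one exact linkage on `keys`.

    Returns (matched pairs, unmatched CSLC records, unmatched NHNS records).
    """
    pool = defaultdict(list)
    for rec in cslc:
        pool[tuple(rec[k] for k in keys)].append(rec)
    matched, unmatched_nhns = [], []
    for rec in nhns:
        candidates = pool[tuple(rec[k] for k in keys)]
        if candidates:
            matched.append((candidates.pop(0), rec))
        else:
            unmatched_nhns.append(rec)
    unmatched_cslc = [r for rest in pool.values() for r in rest]
    return matched, unmatched_cslc, unmatched_nhns

def link_two_stage(cslc, nhns):
    """Combination D, assuming stage 1 uses C and stage 2 uses B on the remainder."""
    m1, rest_cslc, rest_nhns = link(cslc, nhns, KEY_SETS["C"])
    m2, un_cslc, un_nhns = link(rest_cslc, rest_nhns, KEY_SETS["B"])
    return m1 + m2, un_cslc, un_nhns

def person(member, sex, birth, household=1):
    # Toy record in a single made-up unit block.
    return {"prefecture": 13, "district": 1, "unit_block": 1,
            "household": household, "member": member, "sex": sex, "birth": birth}

# Invented example data: four CSLC members, three of whom took part in the NHNS.
cslc = [person(1, "M", "1960-01"), person(2, "F", "1962-05"),
        person(3, "M", "1990-03"), person(4, "F", "1992-07")]
nhns = [person(1, "M", "1960-01"),   # identical in both surveys
        person(9, "F", "1962-05"),   # household member ID changed
        person(3, "M", "1990-08")]   # birth month entered incorrectly

for name in ("A", "B", "C"):
    matched, un_cslc, _ = link(cslc, nhns, KEY_SETS[name])
    print(name, "matched:", len(matched), "unmatched CSLC:", len(un_cslc))
matched, un_cslc, _ = link_two_stage(cslc, nhns)
print("D", "matched:", len(matched), "unmatched CSLC:", len(un_cslc))
```

In this toy data, a changed member ID defeats A and C, while a birth-month entry error defeats B and C, so C matches the fewest records. The greedy pairing in `link` would pair same-sex multiple births arbitrarily under B, which is one way false matches can arise.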

