Patient-reported outcome measures (PROMs) are the only systematic approach through which the patient's perspective can be considered by surgeons (in determining a procedure's efficacy or appropriateness) or healthcare systems (in the context of value-based healthcare). PROMs in registries enable international comparison of patient-centered outcomes after total joint arthroplasty, but the extent to which those scores may vary between different registry populations has not been clearly defined. We therefore asked: (1) To what degree does the mean change in general and joint-specific PROM scores vary across arthroplasty registries, and to what degree is the proportion of missing PROM scores in an individual registry associated with differences in the mean reported change scores? (2) Do PROM scores vary with patient BMI across registries? (3) Are comorbidity levels comparable across registries, and are they associated with differences in PROM scores? Thirteen national, regional, or institutional registries from nine countries reported aggregate PROM scores for patients who had completed PROMs preoperatively and 6 and/or 12 months postoperatively. The requested aggregate general PROM scores were the EuroQol-5 Dimension Questionnaire (EQ-5D) index values, on which a score of 1 reflects "full health" and 0 reflects "as bad as death." Joint-specific PROMs were the Oxford Knee Score (OKS) and the Oxford Hip Score (OHS), with total scores ranging from 0 to 48 (worst-best), and the Hip Disability and Osteoarthritis Outcome Score-Physical Function short form (HOOS-PS) and the Knee Injury and Osteoarthritis Outcome Score-Physical Function short form (KOOS-PS) values, scored 0 to 100 (worst-best). Eligible patients underwent primary unilateral THA or TKA for osteoarthritis between 2016 and 2019. Registries were asked to exclude patients with subsequent revisions within their PROM collection period. Raw aggregated PROM scores and scores adjusted for age, gender, and baseline values were inspected descriptively.
Across all registries and PROMs, the reported percentage of missing PROM data varied from 9% (119 of 1354) to 97% (5305 of 5445). We therefore graphically explored whether PROM scores were associated with the level of data completeness. For each PROM cohort, chi-square tests were used to compare BMI distributions across registries and across 12 predefined PROM strata (men versus women; age 20 to 64 years, 65 to 74 years, and older than 75 years; and high or low preoperative PROM scores). Comorbidity distributions were evaluated descriptively by comparing the proportions of patients with an American Society of Anesthesiologists (ASA) physical status classification of 3 or higher across registries for each PROM cohort. The mean improvement in EQ-5D index values (10 registries) ranged from 0.16 to 0.33 for hip registries and 0.12 to 0.25 for knee registries. The mean improvement in the OHS (seven registries) ranged from 18 to 24, and for the HOOS-PS (three registries) it ranged from 29 to 35. The mean improvement in the OKS (six registries) ranged from 15 to 20, and for the KOOS-PS (four registries) it ranged from 19 to 23. For all PROMs, variation was smaller after the scores were adjusted for differences in age, gender, and baseline values. When we compared the registries, there was no apparent association between the level of missing PROM data and the mean change in PROM scores. The proportions of patients with BMI 30 kg/m² or higher ranged from 16% to 43% (11 hip registries) and from 35% to 62% (10 knee registries). Distributions of patients across six BMI categories differed across hip and knee registries. Further, for all PROMs, distributions also differed across the 12 predefined PROM strata.
For the EQ-5D, patients in the younger age groups (20 to 64 years and 65 to 74 years) had higher proportions of BMI measurements greater than 30 kg/m² than older patients, and patients with the lowest baseline scores had higher proportions of BMI measurements greater than 30 kg/m² than patients with higher baseline scores. These associations were similar for the OHS and OKS cohorts. The proportions of patients with ASA Class 3 or higher ranged across registries from 6% to 35% (eight hip registries) and from 9% to 42% (nine knee registries). Improvements in PROM scores varied among international registries, which may be partially explained by differences in age, gender, and preoperative scores. Higher BMI tended to be associated with lower preoperative PROM scores across registries. The large variation in BMI and comorbidity distributions across registries suggests that future international studies should consider the effect of adjusting for these factors. Although we were not able to evaluate its effect specifically, missing PROM data is a recurring challenge for registries. Demonstrating the generalizability of results and evaluating the degree of response bias are crucial when registry-based PROM data are used to evaluate differences in outcome. Comparability between registries in terms of the specific PROMs collected, postoperative timepoints, and the demographic factors needed for confounder adjustment is necessary if comparisons between registries are to inform and improve arthroplasty care internationally. Level III, therapeutic study.
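As an illustration only, the two calculations named in the methods (the percentage of missing PROM data and a chi-square comparison of BMI distributions across registries) can be sketched in Python. This is not the authors' analysis code. The function names, the counts in `hypothetical_bmi_table`, and the two-category BMI split are assumptions made for this sketch (the study compared six BMI categories); only the missing-data counts (119 of 1354 and 5305 of 5445) are taken from the text above.

```python
def percent_missing(missing, eligible):
    """Percentage of eligible patients without a completed PROM."""
    return 100.0 * missing / eligible


def chi_square_homogeneity(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows (one per registry); each row holds the
    patient counts per BMI category.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if BMI distribution were identical in all registries
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Missing-data extremes reported in the text (counts from the abstract).
lowest = percent_missing(119, 1354)
highest = percent_missing(5305, 5445)

# HYPOTHETICAL counts, not study data: three registries of 100 patients each,
# columns are BMI < 30 kg/m^2 and BMI >= 30 kg/m^2.
hypothetical_bmi_table = [
    [84, 16],
    [57, 43],
    [70, 30],
]
statistic, dof = chi_square_homogeneity(hypothetical_bmi_table)

print(f"missing PROM data: {lowest:.1f}% to {highest:.1f}%")
print(f"chi-square = {statistic:.2f}, df = {dof}")
```

A p value could be obtained from the statistic and degrees of freedom with a chi-square distribution function, for example via `scipy.stats.chi2_contingency`; it is omitted here so that the sketch depends only on the standard library.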