This study aimed to quantify the test-retest reliability of 3 patient-reported outcome measures of pain in people living with phantom limb pain (PLP) and to assess the impact of test-retest error on future research and clinical decisions. Thirty-nine participants (30 male; mean [SD] age, 55 [16] years; mean [SD] time since amputation, 6.8 [8.3] years) reported their PLP levels on a visual analogue scale (VAS) for pain intensity, the revised short-form McGill Pain Questionnaire (SF-MPQ-2), and a pain diary on 2 occasions 7 to 14 days apart. Mean systematic change, within-subject SD, limits of agreement (LOA), coefficient of variation, and the intraclass correlation coefficient (ICC) were quantified, each with its 95% confidence interval (95% CI). Systematic learning effects (mean changes) were not clinically relevant for the VAS, SF-MPQ-2, or pain diary. Within-subject SDs (95% CI) were 11.8 (9.6-15.3), 0.9 (0.7-1.2), and 8.6 (6.9-11.5), respectively. LOA (95% CI) were 32.6 (26.5-42.4), 2.5 (2-3.3), and 23.9 (19.2-31.8), respectively. ICCs (95% CI) were 0.8 (0.6-0.9), 0.8 (0.7-0.9), and 0.9 (0.8-0.9), respectively, but may have been inflated by sample heterogeneity. The test-retest errors were small enough for future studies to detect clinically relevant effect sizes with feasible sample sizes, but the errors for individual participants were large. For people with PLP, a pain intensity VAS, the SF-MPQ-2, and a pain diary show an acceptable level of intersession reliability for use in future clinical trials with feasible sample sizes. Nevertheless, the random error observed for all 3 pain outcome measures suggests that they should be interpreted with caution in case studies and when monitoring an individual's clinical status and progress.