Purpose To quantify factors affecting test-retest variability of threshold measurements across 3 serial visual fields (VFs).
Design Prospective comparative observational study.
Participants Forty-one normal subjects, 10 suspects and 35 stable glaucoma patients.
Methods All subjects performed 3 standard and 3 short-wavelength automated perimetry (SWAP) VFs. At each VF location, severity (defined as age-corrected total deviation) and test-retest variability (TRV), defined as the standard deviation of 3 serial threshold values, were calculated. A multiple regression model (constructed separately for standard VF and SWAP) incorporated 13 factors: severity, location, eccentricity, study group, diagnosis, superior versus inferior hemifield, nasal versus temporal hemifield, one-versus-two thresholds, age, mean pupil size, pupil size variability, between-subject variation, and residual variation.
Main outcome measures Variability in VF threshold sensitivity values.
Results Mean TRV (± standard deviation) for normal, suspect and glaucoma eyes, respectively, was 1.28 ± 0.87, 1.53 ± 1.04 and 2.20 ± 1.79 dB for standard VF, and 1.87 ± 1.35, 1.86 ± 1.24 and 2.68 ± 1.85 dB for SWAP. The contribution of each factor to the model for standard VF (SWAP in parentheses) was: severity 15.5% (6.9%); location 2.7% (4.1%); eccentricity 1.1% (0.64%); diagnosis 2.9% (5.9%); “superior versus inferior” hemifield 0.17% (1.7%); “nasal versus temporal” hemifield 0.06% (0.02%); one-versus-two thresholds 0.04% (0.16%); age 0.1% (0.06%); mean pupil size 0.59% (0.1%); pupil size variability 3.2% (2.8%); between-subject variation 8.0% (13.5%); and residual variation 61.0% (66.6%). Excluding between-subject and residual variation, the 11-factor model accounted for less than one third of the variability seen in both standard VF and SWAP.
Conclusions Severity of defect and between-subject variation exerted the largest effects on TRV.
However, even if all 11 factors could be adjusted for, the magnitude of TRV would be reduced by only 30%. More work is needed to reduce the remaining variability inherent in psychophysical testing and to better understand the intrinsic physiological variability present in both healthy and diseased eyes. Calculating TRV from a larger number of VFs might further reduce the magnitude of the remaining variability found in this study.