Objectives
The UK's Improving Access to Psychological Therapies (IAPT) programme uses the Patient Health Questionnaire Depression Scale (PHQ-9; Kroenke, Spitzer, & Williams, 2001, J. Gen. Intern. Med., 16, 606) and Generalized Anxiety Disorder Scale (GAD-7; Spitzer et al., 2006, Arch. Intern. Med., 166, 1092) to assess patients' symptoms of depression and anxiety respectively. Data are typically collected via telephone or face-to-face; however, no study has statistically investigated whether the questionnaires' items operate equivalently across these modes of data collection. This study aimed to address this omission.

Methods & Results
Questionnaire data from patients registered with an IAPT service in London (N = 23,672) were examined. Confirmatory factor analyses suggested that unidimensional factor structures adequately matched observed face-to-face and telephone data for the PHQ-9 and GAD-7. Invariance analyses revealed that while the PHQ-9 had equivalent factor loadings and latent means across data collection methods, the GAD-7 had equivalent factor loadings but unequal latent means. In support of the scales' convergent validity, positive associations between scores on the PHQ-9 and GAD-7 emerged.

Conclusions
With the exception of the GAD-7's latent means, the questionnaires' factor loadings and latent means were equivalent. This suggests that clinicians may meaningfully compare PHQ-9 data collected face-to-face and by telephone; however, such comparisons with the GAD-7 should be done with caution.

Practitioner points
The PHQ-9 and GAD-7's factor loadings were equivalent across data collection methods.
Only the PHQ-9's latent means were equivalent across data collection methods.
Clinicians may be confident collecting PHQ-9 data by telephone and face-to-face and, then, comparing such data.
Caution is recommended when determining clinical effectiveness using telephone and face-to-face GAD-7 data.
More psychometric research is warranted.