Gender bias may affect assessment in competency-based medical education. The objective was to evaluate the association of gender with the assessment of internal medicine residents. This multisite, retrospective, cross-sectional study included 6 internal medicine residency programs in the United States. Data were collected from July 1, 2016, to June 30, 2017, and analyzed from June 7 to November 6, 2019. Faculty assessments of resident performance during general medicine inpatient rotations were examined. Standardized scores were calculated based on rating distributions for the Accreditation Council for Graduate Medical Education's core competencies and internal medicine Milestones at each site. Standardized scores are expressed as SDs from the mean. The interaction of gender and postgraduate year (PGY) with standardized scores was assessed, adjusting for site, time of year, resident In-Training Examination percentile rank, and faculty rank and specialty. Data included 3600 evaluations for 703 residents (387 male [55.0%]) by 605 faculty (318 male [52.6%]). Interaction between resident gender and PGY was significant in 6 core competencies. In PGY2, female residents scored significantly higher than male residents in 4 of 6 competencies, including patient care (mean standardized score [SE], 0.10 [0.04] vs 0.22 [0.05]; P = .04), systems-based practice (mean standardized score [SE], -0.06 [0.05] vs 0.13 [0.05]; P = .003), professionalism (mean standardized score [SE], -0.04 [0.06] vs 0.21 [0.06]; P = .001), and interpersonal and communication skills (mean standardized score [SE], 0.06 [0.05] vs 0.32 [0.06]; P < .001).
In PGY3, male residents scored significantly higher than female residents in 5 of 6 competencies, including patient care (mean standardized score [SE], 0.47 [0.05] vs 0.32 [0.05]; P = .03), medical knowledge (mean standardized score [SE], 0.47 [0.05] vs 0.24 [0.06]; P = .003), systems-based practice (mean standardized score [SE], 0.30 [0.05] vs 0.12 [0.06]; P = .02), practice-based learning (mean standardized score [SE], 0.39 [0.05] vs 0.16 [0.06]; P = .004), and professionalism (mean standardized score [SE], 0.35 [0.05] vs 0.18 [0.06]; P = .03). There was a significant increase in male residents' competency scores between PGY2 and PGY3 (range of difference in mean adjusted standardized scores between PGY2 and PGY3, 0.208-0.391; P ≤ .002) that was not seen in female residents' scores (range of difference in mean adjusted standardized scores between PGY2 and PGY3, -0.117 to 0.101; P ≥ .14). There was a significant increase in male residents' scores between PGY2 and PGY3 cohorts in 6 competencies with female faculty and in 4 competencies with male faculty. There was no significant change in female residents' competency scores between PGY2 and PGY3 cohorts with male or female faculty. Interaction between faculty-resident gender dyad and PGY was significant in the patient care competency (β estimate [SE] for female vs male dyad in PGY1 vs PGY3, 0.184 [0.158]; β estimate [SE] for female vs male dyad in PGY2 vs PGY3, 0.457 [0.181]; P = .04). In this study, resident gender was associated with differences in faculty assessments of resident performance, and differences were linked to PGY. In contrast to male residents' scores, female residents' scores displayed a peak-and-plateau pattern whereby assessment scores peaked in PGY2. Notably, the peak-and-plateau pattern was seen in assessments by male and female faculty. Further study of factors that influence gender-based differences in assessment is needed.
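
The site-level standardization described above, in which scores are expressed as SDs from the mean of each site's rating distribution, can be illustrated with a minimal sketch. It assumes a plain z-score within each site using the population SD; the abstract does not specify the exact procedure, and the function name, site labels, and ratings below are hypothetical illustrations, not study data.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_site(ratings):
    """Convert (site, raw_score) pairs to z-scores within each site.

    Assumption: z = (score - site mean) / site population SD.
    The study's actual standardization may differ in detail.
    """
    # Group raw scores by site so each site gets its own distribution.
    by_site = defaultdict(list)
    for site, score in ratings:
        by_site[site].append(score)

    # Mean and population SD of the ratings at each site.
    stats = {site: (mean(v), pstdev(v)) for site, v in by_site.items()}

    # Express every rating as SDs from its own site's mean.
    z_scores = []
    for site, score in ratings:
        site_mean, site_sd = stats[site]
        # A site with no spread carries no information; report 0.0.
        z_scores.append(0.0 if site_sd == 0 else (score - site_mean) / site_sd)
    return z_scores


# Hypothetical ratings on two sites that use different rating scales.
example = [("A", 3), ("A", 4), ("A", 5), ("B", 6), ("B", 8)]
scores = standardize_by_site(example)
```

Standardizing within site puts ratings from programs with different scales or grading habits on a common footing, which is what allows mean scores to be compared across sites, genders, and postgraduate years.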