Despite the increasing popularity of peer assessment in tertiary-level interpreter education, little research has examined the quality of peer ratings on language interpretation. Whereas previous research on the quality of peer ratings, particularly rating accuracy, has relied mainly on correlation and analysis of variance, latent trait modelling has emerged as a useful approach to investigating rating accuracy in rater-mediated performance assessment. The present study demonstrates the use of multifaceted Rasch partial credit modelling to explore the accuracy of peer ratings on English-Chinese consecutive interpretation. The analysis revealed a relatively wide spread of rater accuracy estimates and statistically significant differences in rating accuracy between peer raters. Additionally, peer raters found it easier to assess some students accurately than others, to rate target language quality accurately than the other rating domains, and to rate English-to-Chinese interpretation accurately than interpretation in the reverse direction. These findings demonstrate the capability of latent trait modelling to produce individual-level indices, measure rater accuracy directly, and accommodate sparse rating designs. It is therefore hoped that substantive inquiries into peer assessment of language interpretation will utilise latent trait modelling to move this line of research forward.