Contrast-enhanced magnetic resonance imaging (MRI) remains the most comprehensive modality for assessing juvenile idiopathic arthritis (JIA)-related inflammation and osteochondral damage in the temporomandibular joints (TMJs). This study tested the reliability of a new JIA MRI scoring system for the TMJ (JAMRIS-TMJ) and the impact of variations in calibration and reader specialty. Thirty-one MRI exams of bilateral TMJs were scored independently using the JAMRIS-TMJ by 20 readers, comprising radiologists and non-radiologist clinicians, in three reading groups, with or without a calibrating atlas and/or tutorial. Inter-reader reliability in the multidisciplinary cohort, assessed by the generalizability coefficient, was 0.61–0.67 for the inflammatory domain and 0.66–0.74 for the damage domain. The atlas and tutorial did not improve agreement among radiologists, but did improve agreement between the radiologist and non-radiologist groups. Agreement between different calibration levels was 0.02 to 0.08 lower, as measured by the generalizability coefficient, than agreement within calibration levels; agreement between specialty groups was 0.04 to 0.10 lower than agreement within specialty groups. Averaging the scores of two radiologists raised reliability above 0.8 for both domains. The reliability of JAMRIS-TMJ was therefore moderate to good, depending on the presence of specialty and calibration differences. The atlas and tutorial are necessary to improve reliability when the reader cohort consists of multiple specialties.
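The effect of averaging readers on the generalizability coefficient can be illustrated with a minimal sketch. It assumes a single-facet crossed design (exams by readers) with relative error only, in which case the projected coefficient for the mean of k readers reduces to the Spearman-Brown form; the study's actual design may include additional facets. The function name `projected_reliability` and the single-reader value of 0.70 are hypothetical and are not figures reported by the study.

```python
def projected_reliability(single_reader_coef: float, n_readers: int) -> float:
    """Project the generalizability coefficient for the mean of n_readers.

    Assumes a one-facet crossed design with relative error only, so that
    G_k = var_p / (var_p + var_rel / k), which is algebraically equal to
    the Spearman-Brown form k * G_1 / (1 + (k - 1) * G_1).
    """
    if not 0.0 <= single_reader_coef <= 1.0:
        raise ValueError("single_reader_coef must lie in [0, 1]")
    if n_readers < 1:
        raise ValueError("n_readers must be at least 1")
    g1 = single_reader_coef
    return n_readers * g1 / (1 + (n_readers - 1) * g1)


if __name__ == "__main__":
    # Hypothetical single-reader coefficient, chosen only for illustration.
    g1 = 0.70
    for k in (1, 2, 3):
        print(k, round(projected_reliability(g1, k), 3))
```

Under these assumptions, any single-reader coefficient of at least 2/3 projects to 0.8 or higher once two readers are averaged, which is consistent in direction with the reported gain from averaging two radiologists.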