ABSTRACT
This study aimed to determine whether an additional modality (ultrasound tongue imaging) improves the inter-rater reliability of phonetic transcription in childhood speech sound disorders (SSDs) and whether it enables the identification of different or additional errors in children’s speech. Twenty-three English-speaking children aged 5–13 years with SSDs of unknown origin were recorded, with simultaneous audio and ultrasound, producing repetitions of /aCa/ for all places of articulation. Two types of transcription were undertaken off-line: (1) ultrasound-aided transcription by two ultrasound-trained speech-language pathologists (SLPs) and (2) traditional phonetic transcription from audio recordings, completed by the same two SLPs and additionally by two different SSD specialist SLPs. We classified transcriptions and errors into ten subcategories and compared the number of consonants identified as in error by each transcriber, the inter-rater reliability, and the relative frequencies of error types identified by the different types of transcriber. Error-detection rates differed across the transcription types, with the ultrasound-aided transcribers identifying more errors than were identified using traditional audio-only transcription. Analysis revealed that these additional errors were identified on the dynamic ultrasound image despite being transcribed as correct, which suggests subtle motor speech differences. Inter-rater reliability for classifying the type of error was substantial (κ = 0.72) for the ultrasound-aided transcribers and ranged from fair to moderate (κ = 0.38 to 0.52) for the audio-only transcribers. Ultrasound-aided transcribers also identified more instances of increased variability and abnormal timing errors than the audio-only transcribers.
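
The reliability figures above are kappa statistics, which correct raw agreement for the agreement expected by chance. The abstract does not state which kappa variant was used, so the following is only a minimal sketch assuming Cohen's kappa for a single pair of raters. The function name `cohen_kappa` and the error-category labels are hypothetical illustrations, not the study's data or coding scheme.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical error-type labels for six consonant tokens (not study data).
rater_1 = ["correct", "timing", "correct", "variability", "timing", "correct"]
rater_2 = ["correct", "timing", "variability", "variability", "correct", "correct"]

kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree on four of six tokens (observed agreement 0.67) and chance agreement is 13/36, giving κ = 11/23 ≈ 0.48. The descriptors "fair", "moderate" and "substantial" in the abstract are consistent with the commonly used Landis and Koch bands (0.21–0.40, 0.41–0.60 and 0.61–0.80 respectively), under which this toy value would count as moderate.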