This study examines the accuracy of Interaction Detection in Early Childhood Settings (IDEAS), a program that automatically transcribes audio files and estimates linguistic units relevant to speech-language therapy, including part-of-speech units that represent features of language complexity, such as adjectives and coordinating conjunctions. Forty-five video-recorded speech-language therapy sessions involving 27 speech-language pathologists (SLPs) and 56 children were used. The F measure is used to evaluate the accuracy of IDEAS diarization (i.e., speech segmentation and speaker classification). Two additional evaluation metrics, median absolute relative error and correlation, are used to evaluate the accuracy of IDEAS linguistic unit estimates relative to two comparison conditions: Oracle (manual diarization) and Voice Type Classifier (an existing diarizer with acceptable accuracy). The F measure indicates that IDEAS diarization is highly accurate for SLP talk but less accurate for child talk. This difference carries over to the accuracy of the IDEAS linguistic unit estimates: median absolute relative error and correlation values meet the accuracy criteria for nine of the 10 SLP linguistic unit estimates but for none of the child linguistic unit estimates. The type of linguistic unit also affects IDEAS accuracy. IDEAS was tailored to educational settings to automatically convert audio recordings into text and to provide linguistic unit estimates in speech-language therapy sessions and classroom settings. Although not perfect, IDEAS reliably captures and returns linguistic units that are relevant to research and practice, especially for SLP talk. The tool offers a way to automatically measure SLP talk in clinical settings, which will support research seeking to understand how SLP talk influences children's language growth.
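The three evaluation metrics named above can be illustrated with a minimal Python sketch. It uses the standard textbook definitions, which are assumptions here: the abstract does not state the exact formulas, the type of correlation (Pearson is assumed), or the accuracy thresholds. The function names are hypothetical and are not part of IDEAS.

```python
from math import sqrt
from statistics import mean, median


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed balanced F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def median_absolute_relative_error(estimates, references):
    """Median of |estimate - reference| / reference across sessions.

    Sessions with a reference count of zero are skipped, because the
    relative error is undefined there (an assumption of this sketch).
    """
    errors = [
        abs(est - ref) / ref
        for est, ref in zip(estimates, references)
        if ref != 0
    ]
    return median(errors)


def pearson_correlation(xs, ys):
    """Pearson correlation between automated and reference counts."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)
```

In this sketch, `estimates` would hold the per-session counts of one linguistic unit (for example, adjectives) produced automatically, and `references` the counts from a comparison condition such as manual diarization. A lower median absolute relative error and a higher correlation would both indicate closer agreement.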