Abstract

Automated processing and analysis of child speech has long been acknowledged as a harder problem than understanding adult speech. In particular, conversations between a child and an adult involve spontaneous speech, which often compounds the idiosyncrasies associated with child speech. In this work, we improve speaker diarization (determining who spoke when) from audio of child-adult conversations in naturalistic settings. We select conversations from the autism diagnosis and intervention domains, where speaker diarization is an important step towards computational behavioral analysis in support of clinical research and decision making. We train deep speaker embeddings using publicly available child speech and adult speech corpora, unlike predominant state-of-the-art models, which typically use only adult speech for speaker embedding training. We demonstrate significant relative reductions in diarization error rate (DER) on DIHARD II (dev) sessions containing child speech (22.88%) and on two internal corpora representing interactions involving children with autism: excerpts from ADOS Mod3 sessions (33.7%) and a combination of full-length ADOS and BOSCC sessions (44.99%). Further, we validate our improvements in identifying the child speaker (who typically has a short speaking time) using the recall measure. Finally, we analyze the effect of fundamental frequency augmentation and the effects of child age and gender on speaker diarization performance.
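For readers unfamiliar with the metric, the following is a minimal sketch of how DER and a relative DER reduction are commonly computed. It assumes the standard definition of DER as the sum of missed speech, false alarm, and speaker confusion time divided by total reference speech time. The function names and all numbers are hypothetical illustrations and are not taken from the paper's experiments.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Standard DER: (missed + false alarm + confusion) / reference speech time.

    All arguments are durations in seconds.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def relative_reduction(baseline_der, improved_der):
    """Relative DER reduction in percent, measured against the baseline."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return 100.0 * (baseline_der - improved_der) / baseline_der


# Hypothetical durations (seconds) for one session; NOT results from the paper.
baseline = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=50.0, total_speech=500.0
)
improved = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=25.0, total_speech=500.0
)

print(f"baseline DER: {baseline:.3f}")
print(f"improved DER: {improved:.3f}")
print(f"relative reduction: {relative_reduction(baseline, improved):.1f}%")
```

Under this reading, a figure such as 22.88% is the percentage by which DER drops relative to the baseline system, not an absolute change in DER.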
