Abstract

Autism Spectrum Disorder (ASD) is characterized by difficulties in social communication and social interaction, and by repetitive behaviors. Some of these difficulties are apparent in the speech characteristics of verbal children with ASD. Developing algorithms that can extract and quantify speech features unique to children with ASD is therefore extremely valuable for assessing each child's initial state and their development over time. An important component of such algorithms is speaker diarization in the noisy clinical environments where children with ASD are diagnosed. Here we present a Gaussian Mixture Model (GMM) approach to speaker diarization that was applied to 34 recordings from clinical assessments using the Autism Diagnostic Observation Schedule (ADOS). We used mel-frequency cepstral coefficients (MFCCs) and pitch-based features to classify segments containing speech of the child, therapist, or parent, movement noises (chair, toys, etc.), and simultaneous speech. We achieved an accuracy of 89% in identifying segments with children's speech and an accuracy of 74.5% in identifying children's and therapists' speech segments. These accuracy rates are similar to the diarization accuracy rates reported by comparable previous studies, thereby demonstrating a promising route for the automated assessment of speech in children with ASD.
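The classification step described above can be illustrated with a minimal sketch: fit one GMM per segment class, then label a new segment with the class whose model gives the highest average log-likelihood. This is not the authors' implementation. The diagonal covariances, the two mixture components, the three class labels and the synthetic 3-dimensional vectors (standing in for real MFCC and pitch features) are all assumptions made for illustration.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def logsumexp(vals):
    """Numerically stable log(sum(exp(v) for v in vals))."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))


def fit_gmm(frames, k=2, iters=25, seed=0, var_floor=1e-3):
    """Fit a k-component diagonal GMM to a list of feature vectors with EM."""
    rng = random.Random(seed)
    dim = len(frames[0])
    means = [list(f) for f in rng.sample(frames, k)]  # init from data points
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            logp = [math.log(w) + log_gauss_diag(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = logsumexp(logp)
            resp.append([math.exp(lp - norm) for lp in logp])
        # M-step: re-estimate weights, means and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(frames)
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, frames)) / nj
                        for d in range(dim)]
            variances[j] = [
                max(var_floor,
                    sum(r[j] * (x[d] - means[j][d]) ** 2
                        for r, x in zip(resp, frames)) / nj)
                for d in range(dim)]
    return weights, means, variances


def segment_loglik(frames, gmm):
    """Average per-frame log-likelihood of a segment under one GMM."""
    weights, means, variances = gmm
    total = sum(
        logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                   for w, m, v in zip(weights, means, variances)])
        for x in frames)
    return total / len(frames)


def classify_segment(frames, models):
    """Return the label whose GMM scores the segment highest."""
    return max(models, key=lambda label: segment_loglik(frames, models[label]))


def synthetic_frames(center, n, rng, spread=0.5):
    """Hypothetical stand-in for per-frame MFCC/pitch feature vectors."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]


rng = random.Random(42)
# Hypothetical class centers; real features would come from labeled audio.
centers = {
    "child": [3.0, 2.0, -1.0],
    "therapist": [-2.0, 0.0, 1.5],
    "noise": [0.0, -4.0, 4.0],
}
models = {label: fit_gmm(synthetic_frames(c, 200, rng), k=2)
          for label, c in centers.items()}

test_segment = synthetic_frames(centers["child"], 30, rng)
predicted = classify_segment(test_segment, models)
print(predicted)
```

Scoring a whole segment by its average frame log-likelihood, rather than voting frame by frame, is one common design choice for GMM-based speaker classification. Whether the study used this scoring rule is not stated in the abstract.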
