Towards developing speaker diarization for parent-child interactions

Abhejay Murali, Dwight Irvin, John H Hansen, Meena Chandra Shekar, Jay Buzhardt, Satwik Dutta

doi:10.1121/10.0015551

Abstract

Children's daily interactions with their parents are crucial for spoken language skills and overall development. Capturing these interactions can provide meaningful feedback to parents and practitioners. However, capturing naturalistic audio and building a speech processing pipeline for parent-child interactions is a challenging problem. One of the first important steps in such a pipeline is speaker diarization: identifying who spoke when. Speaker diarization separates a captured audio stream into homogeneous segments that are distinguished by the identity of the speaker (child or parent). In compliance with ongoing COVID-19 restrictions and human subjects research IRB protocols, an unsupervised data collection approach was formulated to record parent-child interactions of consented families using the LENA device, a lightweight audio recorder. Two interaction scenarios were explored: a book reading activity at home and spontaneous interactions in a science museum. To distinguish a child's speech from a parent's, we train the diarization models on open-source adult speech data and on child speech data acquired from the LDC (Linguistic Data Consortium). Various speaker embeddings (e.g., x-vectors, i-vectors, ResNet-based embeddings) will be explored. Results will be reported using Diarization Error Rate (DER). [Work sponsored by NSF via Grant Nos. 1918032 and 1918012.]
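
For readers unfamiliar with the metric named in the abstract, Diarization Error Rate is commonly defined as the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The Python sketch below illustrates that definition at the frame level. It is not the authors' evaluation code: the function names, the 10 ms frame step, and the example segments are assumptions made for illustration, and it ignores overlapping speech and scoring collars, which standard scoring tools handle.

```python
from itertools import permutations


def to_frames(segments, n_frames, step):
    """Rasterize (start, end, label) segments into one label per frame.

    Frames not covered by any segment stay None (non-speech).
    """
    frames = [None] * n_frames
    for start, end, label in segments:
        first = round(start / step)
        last = min(round(end / step), n_frames)
        for i in range(first, last):
            frames[i] = label
    return frames


def diarization_error_rate(reference, hypothesis, step=0.01):
    """Simplified frame-level DER (assumption: no overlap, no collar)."""
    duration = max(end for _, end, _ in reference + hypothesis)
    n_frames = round(duration / step)
    ref = to_frames(reference, n_frames, step)
    hyp = to_frames(hypothesis, n_frames, step)

    total_speech = sum(r is not None for r in ref)
    if total_speech == 0:
        raise ValueError("reference contains no speech")

    missed = sum(r is not None and h is None for r, h in zip(ref, hyp))
    false_alarm = sum(r is None and h is not None for r, h in zip(ref, hyp))

    # Hypothesis labels are arbitrary cluster names, so try every mapping
    # onto the reference labels and keep the one with the least confusion.
    ref_labels = sorted({r for r in ref if r is not None})
    hyp_labels = sorted({h for h in hyp if h is not None})
    padding = [None] * max(0, len(hyp_labels) - len(ref_labels))
    confusion = None
    for perm in permutations(ref_labels + padding, len(hyp_labels)):
        mapping = dict(zip(hyp_labels, perm))
        errors = sum(
            r is not None and h is not None and mapping[h] != r
            for r, h in zip(ref, hyp)
        )
        if confusion is None or errors < confusion:
            confusion = errors

    return (missed + false_alarm + confusion) / total_speech


# Hypothetical example: the system places the parent-to-child
# speaker change 0.5 s too early in a 3 s recording.
reference = [(0.0, 2.0, "parent"), (2.0, 3.0, "child")]
hypothesis = [(0.0, 1.5, "A"), (1.5, 3.0, "B")]
der = diarization_error_rate(reference, hypothesis)
```

In this made-up example, 0.5 s of parent speech is attributed to the cluster that maps to the child, so the error is 0.5 s of confusion over 3 s of reference speech.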

More From: The Journal of the Acoustical Society of America
