Abstract

Language ENvironment Analysis (LENA) is a lightweight audio capture device commonly used in research to monitor language, including the frequency of parent-child interactions. However, factors such as price and technical limitations often restrict the use of LENA with low-income families. A smartphone can be a feasible alternative when LENA is not available to practitioners or families. Over the past three decades, significant advancements in smartphone platforms, microphones, and related technology have made high-quality audio recording available in most households. In this study, we compare audio quality and measure the performance of several Automatic Speech Recognition (ASR) engines on audio captured with iPhone and Android smartphones relative to LENA devices. Families who consented to participate in this study recorded reading activities with their children at home using both personal smartphones and LENA. Challenges we encountered include recording synchronization, unwanted background noise, and uneven room acoustics. Audio quality is compared using the Speech-Signal-to-Noise Ratio (SSNR) metric. Both open-source and fine-tuned ASR models are explored, with results reported as overall Word Error Rate (WER) as well as separately by speaker (child versus adult). Results show that smartphone platforms can be used as an alternative to LENA for child-adult language assessment. [Work sponsored by NSF through Grant Nos. 1918032 and 1918012.]
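For readers unfamiliar with the metric, the overall and per-speaker WER reporting described above can be illustrated with a minimal sketch. It assumes whitespace tokenization and lowercase normalization, and it pools errors across utterances before dividing by the total reference word count. The function names and the (speaker, reference, hypothesis) input layout are hypothetical and are not taken from the study.

```python
def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    # Assumption: lowercase + whitespace split is the only text normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] is the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1], len(ref)


def wer_by_speaker(utterances):
    """Pool errors per speaker label and overall.

    utterances: iterable of (speaker, reference, hypothesis) tuples
    (a hypothetical layout chosen for this illustration).
    """
    totals = {}
    for speaker, ref, hyp in utterances:
        errs, n = word_errors(ref, hyp)
        for key in (speaker, "overall"):
            e, m = totals.get(key, (0, 0))
            totals[key] = (e + errs, m + n)
    # WER = total errors / total reference words, skipping empty groups.
    return {k: e / m for k, (e, m) in totals.items() if m > 0}
```

Pooling errors before dividing weights each word equally, so long adult utterances do not get the same weight as short child utterances; averaging per-utterance WER instead would give a different number.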
