Word-Final Phoneme Segmentation Using Cross-Correlation.

Emilian-Erman Mahmut, Vasile Stoicu-Tivadar, Stelian Nicola

doi:10.3233/shti200709

Open Access

https://doi.org/10.3233/shti200709

Abstract

This paper presents an automated method for segmenting a target phoneme in word-final position, based on cross-correlation coefficients computed between a reference sound wave and a sample sound wave. Most existing Speech Sound Disorder (SSD) screening solutions require human intervention to a greater or lesser extent and use segmentation methods based on hard-coded time frames. Moreover, existing solutions extract features from the frequency domain, which demands substantial computational power at the expense of real-time feedback. The pre-processing algorithm proposed in this paper, implemented as a Python 3.7 script, automatically generates two new .wav files corresponding to the phonemes found in word-final position in the initial sound waves. The newly generated .wav files are meant to serve as valid and homogeneous input to a subsequent classification stage that aims to rigorously discriminate mispronunciations of the target phoneme and to assist Speech-Language Pathologists (SLPs) with SSD screening.
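
The abstract does not spell out the segmentation procedure, so the following is only a minimal sketch of the general idea of locating a segment by time-domain cross-correlation coefficients. It assumes, for illustration, that the reference wave is a short template of the target phoneme and that the sample wave is a whole-word recording already loaded as a one-dimensional array. The function names `locate_by_cross_correlation` and `extract_segment` are hypothetical and do not come from the paper.

```python
import numpy as np


def locate_by_cross_correlation(sample, reference):
    """Return (start_index, coefficient) of the window in `sample`
    whose normalized cross-correlation with `reference` is highest.

    Illustrative assumption: `reference` is a short phoneme template
    and `sample` is a longer word recording at the same sampling rate.
    """
    sample = np.asarray(sample, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = reference.size
    if n == 0 or sample.size < n:
        raise ValueError("reference must be non-empty and no longer than sample")

    # Remove the mean of the reference so the coefficient ignores DC offset.
    ref = reference - reference.mean()
    ref_norm = np.linalg.norm(ref)
    if ref_norm == 0.0:
        raise ValueError("reference must not be constant")

    # Numerator for every lag. Because `ref` is zero-mean, correlating the
    # raw windows gives the same result as correlating mean-removed windows.
    numerator = np.correlate(sample, ref, mode="valid")

    # Energy of each mean-removed window, computed from running sums.
    csum = np.concatenate(([0.0], np.cumsum(sample)))
    csum_sq = np.concatenate(([0.0], np.cumsum(sample ** 2)))
    win_sum = csum[n:] - csum[:-n]
    win_sq = csum_sq[n:] - csum_sq[:-n]
    win_energy = np.maximum(win_sq - win_sum ** 2 / n, 0.0)

    # Pearson-style coefficient per lag; silent windows get coefficient 0.
    denominator = np.sqrt(win_energy) * ref_norm
    coeffs = np.divide(
        numerator,
        denominator,
        out=np.zeros_like(numerator),
        where=denominator > 1e-12,
    )
    start = int(np.argmax(coeffs))
    return start, float(coeffs[start])


def extract_segment(sample, reference):
    """Cut out the best-matching window of `sample` (same length as `reference`)."""
    sample = np.asarray(sample, dtype=float)
    start, _ = locate_by_cross_correlation(sample, reference)
    return sample[start:start + len(reference)]
```

Everything here stays in the time domain, in line with the abstract's point about avoiding frequency-domain feature extraction. Because the coefficient is normalized, a quieter or louder rendition of the same waveform still scores close to 1. Reading the input recordings and writing the extracted segments back out as .wav files is left out of the sketch.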
