Abstract
This paper proposes a method for automatically correcting three types of stutter in disfluent speech (repetitions, prolongations, and long pauses) using signal processing techniques. Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coefficients (LPC) are used as features. Short-time energy is the parameter used to remove repetitions, and correlation between frames is used to remove prolongations. To handle long pauses, the input speech is resampled to 22.05 kHz and the samples belonging to long pauses are removed, while the natural pauses between words are retained. Little work has been reported on automatic stutter correction using signal processing methods, and no work correcting all three types of stutter simultaneously has been reported. The method achieves accuracies of 88.35%, 94.3%, and 97.5% for repetitions, prolongations, and long pauses, respectively, with an average correction time of 2 seconds on an Intel 8th-generation i5 system, making it suitable for time-critical applications.
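As an illustration of the long-pause step only, the following is a minimal sketch and not the authors' implementation. It assumes the signal has already been resampled to 22.05 kHz and that a pause can be detected by thresholding per-frame mean energy. The abstract does not state how pauses are detected, so that choice is an assumption, as are the function name `trim_long_pauses`, the 20 ms frame length, the energy threshold, and the 200 ms limit on retained silence.

```python
import numpy as np

def trim_long_pauses(signal, sample_rate=22050, frame_ms=20,
                     energy_threshold=1e-4, max_pause_ms=200):
    """Shorten silent stretches to at most max_pause_ms (hypothetical helper).

    All parameter values are illustrative assumptions, not values from the paper.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    max_pause_frames = max(1, max_pause_ms // frame_ms)
    n_frames = len(signal) // frame_len

    kept = []
    silent_run = 0  # number of consecutive low-energy frames seen so far
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        energy = np.mean(frame.astype(np.float64) ** 2)  # mean energy of the frame
        if energy < energy_threshold:
            silent_run += 1
            if silent_run > max_pause_frames:
                continue  # drop silence beyond the allowed natural pause
        else:
            silent_run = 0
        kept.append(frame)

    # Keep any leftover samples that do not fill a whole frame.
    tail = signal[n_frames * frame_len:]
    if len(tail):
        kept.append(tail)
    return np.concatenate(kept) if kept else signal[:0]
```

Calling `trim_long_pauses(samples)` on a mono NumPy array would return a shorter array in which each silent stretch is capped at roughly `max_pause_ms`. In practice the threshold would need tuning to the recording level and noise floor.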