Abstract
Sentence segmentation is important for improving the human readability of the output of Automatic Speech Recognition (ASR) systems. Although it has been explored in numerous interdisciplinary studies, segmenting Portuguese speech remains time-consuming because efficient automatic segmentation methods are lacking and the work relies on qualified phonetic experts. This paper presents a novel algorithm that efficiently segments speech into sentences by training an Artificial Neural Network (ANN) classification model on windows of the sentences' spectrograms. In our experiments, the beginning of a European Portuguese (EP) sentence enables better identification of the sentence's boundaries. In addition, a spectrogram window built from the final 100 milliseconds (ms) of the preceding sentence and the first 300 ms of the following sentence gives the best performance in automatic sentence segmentation. As a result, the proposed algorithm can automatically segment Portuguese speech into sentences by analyzing its spectrogram, without any knowledge of the speech's semantics.
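The window described above can be illustrated with a minimal sketch. It assumes the spectrogram is already available as a sequence of frames and that frames are spaced 10 ms apart; the hop size, the function name `boundary_window`, and its parameters are hypothetical choices for illustration and are not taken from the paper. The sketch only cuts out the 100 ms before and 300 ms after a candidate boundary; the ANN classifier that would consume this window is not shown.

```python
from typing import List, Sequence

Frame = Sequence[float]  # one spectrogram column (one value per frequency bin)


def boundary_window(
    spectrogram: Sequence[Frame],
    boundary_ms: float,
    hop_ms: float = 10.0,    # assumed frame hop; not stated in the abstract
    pre_ms: float = 100.0,   # ending of the preceding sentence
    post_ms: float = 300.0,  # beginning of the following sentence
) -> List[Frame]:
    """Return the frames covering [boundary - pre_ms, boundary + post_ms)."""
    center = int(round(boundary_ms / hop_ms))
    n_pre = int(round(pre_ms / hop_ms))
    n_post = int(round(post_ms / hop_ms))
    start, stop = center - n_pre, center + n_post
    if start < 0 or stop > len(spectrogram):
        raise ValueError("window extends past the spectrogram")
    return list(spectrogram[start:stop])


# Toy spectrogram: 200 frames (2 s at a 10 ms hop), 4 frequency bins each.
# Every bin in frame t holds the value t, so frames are easy to identify.
toy = [[float(t)] * 4 for t in range(200)]
window = boundary_window(toy, boundary_ms=1000.0)
print(len(window))  # 40 frames: 10 before the candidate boundary, 30 after
```

Under these assumptions, each candidate boundary yields a fixed-size block of 40 frames, which could be flattened into a feature vector for a classifier that labels the candidate as a sentence boundary or not.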