Abstract

The performance of a speech recognition system depends on the user's accent and therefore varies, since people from different backgrounds have different accents. Accent labeling and conversion have been reported as a prospective solution to the challenges faced in language learning and various other voice-based applications. In an English TTS system, stress (accent) labeling of unregistered words is another important step besides phonetic conversion. Because primary stress is far more important than secondary stress, and is also easier to determine, the labeling of primary stress is handled separately from that of secondary stress. In this work, primary stress is labeled with an algorithm that combines morphological rules and machine learning, while secondary stress is labeled entirely by machine learning algorithms. After 10 rounds of cross-validation, the average tagging accuracy was 94% for primary stress and 94% for secondary stress, and the total tagging accuracy was 83.6%. This perceptual study separates the labeling of primary and secondary stress and yields promising results.
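The combination of morphological rules with a learned model for primary stress can be pictured with a minimal sketch. It is not the authors' algorithm: the suffix table `SUFFIX_RULES`, the helper names `train_fallback` and `primary_stress`, and the toy training words are all hypothetical, and the "learned" part is a simple frequency model standing in for whatever machine learning method the work actually uses. Words are assumed to be already split into syllables.

```python
from collections import Counter, defaultdict

# Hypothetical morphological rules: suffix -> position of the stressed
# syllable counted from the end of the word (1 = last syllable).
SUFFIX_RULES = {
    "tion": 2,  # in-for-MA-tion
    "ic": 2,    # a-TOM-ic
    "ee": 1,    # em-ploy-EE
}


def train_fallback(examples):
    """Learn the most frequent stress index for each syllable count.

    `examples` is an iterable of (syllables, stress_index) pairs.
    This frequency model is a stand-in for a real classifier.
    """
    counts = defaultdict(Counter)
    for syllables, stress_index in examples:
        counts[len(syllables)][stress_index] += 1
    return {n: c.most_common(1)[0][0] for n, c in counts.items()}


def primary_stress(syllables, fallback):
    """Return (stress_index, source) for a syllabified word.

    Rules are tried first, longest suffix first; if no rule applies,
    the learned fallback decides.
    """
    word = "".join(syllables)
    for suffix in sorted(SUFFIX_RULES, key=len, reverse=True):
        if word.endswith(suffix):
            index = len(syllables) - SUFFIX_RULES[suffix]
            if 0 <= index < len(syllables):
                return index, "rule"
    return fallback.get(len(syllables), 0), "learned"


# Toy training data (illustrative only).
TRAIN = [
    (["ta", "ble"], 0),
    (["wa", "ter"], 0),
    (["a", "bout"], 1),
    (["com", "pu", "ter"], 1),
    (["ba", "na", "na"], 1),
    (["el", "e", "phant"], 0),
]

fallback = train_fallback(TRAIN)
print(primary_stress(["in", "for", "ma", "tion"], fallback))
print(primary_stress(["gar", "den"], fallback))
```

Trying the rules before the learned model reflects the idea that a reliable suffix pattern should take precedence, with the model covering words that no rule matches. Whether the original system orders the two components this way is an assumption.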
