Abstract

This paper touches upon a model of simultaneous acoustic masking, which detects speech signal components perceived by a human’s auditory system. A simultaneous masking algorithm on the basis of this model is proposed. It is shown that, after simultaneous masking, a signal becomes a binary structure that reflects the harmonic structure of a vocalized sequence. It is experimentally proven that this structure can be used to detect key speech segments (from the standpoint of perception by an auditory system). This structure serves as a basis for an algorithm of high-quality segmentation of a speech signal into vocalized and unvocalized segments, which does not require learning before use. The joint use of the algorithms for simultaneous masking and speech signal segmentation is tested, and their performance is evaluated.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.