Abstract

Any specific vowel sound that humans produce can be represented in terms of four perceptual features in addition to the vowel category: pitch, loudness, brightness, and roughness. The corresponding acoustic features chosen here are fundamental frequency (fo), sound pressure level (SPL), normalized spectral centroid (NSC), and approximate entropy (ApEn). In this study, thyroarytenoid (TA) and cricothyroid (CT) activations were varied computationally to study their relationship with these four acoustic features. Postural and material property variables, such as vocal fold length (L) and fiber stress (σ) in the three vocal fold tissue layers, were also calculated. A fiber-gel finite element model developed at the National Center for Voice and Speech was used for this purpose. Muscle activation plots were generated to show how postural and acoustic features depend on TA and CT muscle activations. These relationships were compared against data from previous in vivo human larynx studies and from canine laryngeal studies. In general, fo and SPL increase with CT activation, while NSC decreases when CT activation is raised above 20%. The acoustic features show no uniform trend with TA activation, except that SPL increases uniformly with TA when TA co-varies with CT activation. Trends for postural variables and material properties are also discussed in terms of activation levels.

Highlights

  • This study was motivated by a desire to eventually control a voice simulator with inputs related to perception

  • C(r) is the correlation integral computed as suggested in [56], r is the radius of similarity (chosen as 0.2 × variance(x)), and N is the number of samples of the signal x(t)

  • The current study focused on how TA and CT muscle activation levels control various acoustic and posturing features of voice production
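The approximate-entropy highlight above names a correlation integral C(r), a radius of similarity r, and a sample count N. As a rough illustration only, the following Python sketch computes ApEn in the standard Pincus formulation, which may differ in detail from the formula the authors take from [56]. The embedding dimension m = 2, the Chebyshev distance, and the function name approximate_entropy are assumptions not stated in the source; only the default radius, 0.2 × variance(x), follows the highlight.

```python
import numpy as np


def approximate_entropy(x, m=2, r=None):
    """Approximate entropy of a 1-D signal (standard Pincus form).

    m is the embedding dimension (assumed, not given in the source).
    r defaults to 0.2 * variance(x), as quoted in the highlight.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if r is None:
        r = 0.2 * np.var(x)

    def phi(dim):
        count = n - dim + 1
        # Rows are the overlapping template vectors of length `dim`.
        emb = np.array([x[i:i + dim] for i in range(count)])
        # Chebyshev (max-norm) distance between every pair of templates.
        dist = np.max(np.abs(emb[:, None, :] - emb[None, :, :]), axis=2)
        # Fraction of templates within r of each template; self-matches
        # are included, so the logarithm is always defined.
        c = np.mean(dist <= r, axis=1)
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(400) / 400.0
    periodic = np.sin(2 * np.pi * 5 * t)
    noisy = rng.standard_normal(400)
    print("ApEn, periodic signal:", approximate_entropy(periodic))
    print("ApEn, white noise:    ", approximate_entropy(noisy))
```

A strongly periodic signal should yield a lower value than white noise, which is why ApEn can serve as a proxy for aperiodicity. The pairwise distance matrix needs memory proportional to N squared, so this sketch suits short analysis frames rather than long recordings.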


Summary

Introduction

This study was motivated by a desire to eventually control a voice simulator with inputs related to perception. Aside from vowel perception, a sound can be represented in terms of pitch, loudness, and timbre [1]. Timbre is the quality of the sound that differentiates one sound from another when pitch and loudness are the same [2]. It can be divided into two components, brightness and roughness [3]. The four perceptual features can be quantified with fundamental frequency, sound pressure level, spectral content, and aperiodicity. Neglecting additive noise from air turbulence (breathiness or aspiration in a vowel), it is believed that these four perceptual or acoustic features contain many of the characteristics of a given vowel sound [4].
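To make the mapping from perceptual to acoustic features more concrete, the sketch below estimates three of the four quantities from a sampled pressure waveform: fundamental frequency, sound pressure level, and a normalized spectral centroid. It is an illustration under stated assumptions, not the authors' procedure. The autocorrelation pitch estimator, the 20 µPa reference pressure, the Hann window, the definition of NSC as spectral centroid divided by fo, and all function names are assumptions introduced here.

```python
import numpy as np


def estimate_fo(x, fs, fmin=50.0, fmax=1000.0):
    """Fundamental frequency (Hz) from the highest autocorrelation peak.

    The search range fmin..fmax is an assumed plausible voice range.
    """
    x = np.asarray(x, dtype=float)
    x = x - np.mean(x)
    # Keep non-negative lags only.
    ac = np.correlate(x, x, mode="full")[x.size - 1:]
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return fs / lag


def spl_db(p, p_ref=20e-6):
    """Sound pressure level in dB, for pressure p in pascals.

    The 20 micropascal reference is the usual convention (assumed here).
    """
    rms = np.sqrt(np.mean(np.square(p)))
    return 20.0 * np.log10(rms / p_ref)


def normalized_spectral_centroid(x, fs, fo):
    """Magnitude-weighted mean frequency divided by fo (assumed definition)."""
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.rfft(x * np.hanning(x.size)))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    centroid = np.sum(freqs * spectrum) / np.sum(spectrum)
    return centroid / fo


if __name__ == "__main__":
    fs = 16000
    t = np.arange(4000) / fs
    # Synthetic test tone: 200 Hz fundamental plus two weaker harmonics.
    p = (np.sin(2 * np.pi * 200 * t)
         + 0.5 * np.sin(2 * np.pi * 400 * t)
         + 0.25 * np.sin(2 * np.pi * 600 * t))
    fo = estimate_fo(p, fs)
    print("fo (Hz): ", fo)
    print("SPL (dB):", spl_db(p))
    print("NSC:     ", normalized_spectral_centroid(p, fs, fo))
```

Under this assumed definition, a pure tone has an NSC close to 1 and added harmonic energy raises it, which matches the intuitive reading of brightness as spectral weight above the fundamental.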

