Abstract

When we hear a new voice on the radio, we can tell whether the speaker is an adult or a child. We can also extract the message of the communication without being confused by the size information. This shows that auditory signal processing is scale invariant, automatically segregating information about vocal tract shape from information about vocal tract length. Patterson and colleagues have performed a series of experiments to measure the characteristics of size/shape perception [e.g., Smith et al., J. Acoust. Soc. Am. 117(1), 305–318 (2005)], and provided a mathematical basis for auditory scale invariance in the form of the stabilized wavelet-Mellin transform (SWMT) [Irino and Patterson, Speech Commun. 36(3–4), 181–203 (2002)]. The mathematics of the SWMT dictates the optimal form of the auditory filter, insofar as it must satisfy minimal uncertainty in a time-scale representation [Irino and Patterson, J. Acoust. Soc. Am. 101(1), 412–419 (1997)]. The resulting gammachirp auditory filter is an asymmetric extension of the earlier gammatone auditory filter, one that can explain the level dependence of notched-noise masking. Thus, although it is not immediately intuitive, speaker size perception and auditory filter shape are both aspects of a larger, unified framework for auditory signal processing.
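
To make the relation between the two filters concrete, the following is a minimal sketch of the analytic gammachirp impulse response, g_c(t) = a t^(n-1) exp(-2π b ERB(f_r) t) cos(2π f_r t + c ln t + φ), in which setting the chirp parameter c to zero recovers the gammatone. The function names, the sampling rate, and the default parameter values (n = 4, b = 1.019, c = -2) are illustrative assumptions and are not taken from the abstract; the ERB formula is the commonly used Glasberg and Moore approximation.

```python
import math

def erb_hz(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def gammachirp(t, f_r=1000.0, n=4, b=1.019, c=-2.0, a=1.0, phi=0.0):
    """Gammachirp impulse response at time t in seconds.

    Parameter defaults are illustrative assumptions. With c = 0 the
    c*ln(t) phase term vanishes and the result is a gammatone.
    """
    if t <= 0.0:
        return 0.0  # causal filter: no response at or before t = 0
    envelope = a * t ** (n - 1) * math.exp(-2.0 * math.pi * b * erb_hz(f_r) * t)
    phase = 2.0 * math.pi * f_r * t + c * math.log(t) + phi
    return envelope * math.cos(phase)

def impulse_response(fs=16000, n_samples=400, **params):
    """Sample the impulse response at n_samples points spaced 1/fs apart."""
    return [gammachirp(k / fs, **params) for k in range(n_samples)]

# Same envelope, different phase: the c*ln(t) term makes the carrier chirp,
# which is what gives the magnitude spectrum its asymmetry.
gammatone = impulse_response(c=0.0)
chirp = impulse_response(c=-2.0)
```

Comparing the magnitude spectra of `gammatone` and `chirp` (for example with a discrete Fourier transform) would be one way to visualize the asymmetry that the chirp term introduces.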
