Abstract

The intelligibility of interrupted speech (interrupted over time) and checkerboard speech (interrupted over time-by-frequency), both of which retained half of the original speech, was examined. The intelligibility of interrupted speech stimuli decreased as segment duration increased. Checkerboard speech stimuli with 20 frequency bands yielded nearly 100% intelligibility irrespective of segment duration, whereas with 2 and 4 frequency bands, a trough of 35%-40% appeared at the 160-ms segment duration. Mosaic speech stimuli (in which power was averaged over each time-frequency unit) yielded generally poor intelligibility (⩽10%). The results revealed the limitations of the underlying auditory organization for speech cues scattered in the time-frequency domain.

Highlights

  • Speech perception has been investigated by degrading speech signals in time and frequency

  • These observations were supported by the analysis using a generalized linear mixed model (GLMM) with a logistic linking function as implemented in an add-in for JMP (SAS Institute Inc., 2018)

  • A growing body of literature has revealed the relationship between speech intelligibility and the modulation power spectrum (MPS) (Singh and Theunissen, 2003; Elliott and Theunissen, 2009; Venezia et al., 2016; Sohoglu and Davis, 2020; Flinker et al., 2019)

Summary

Introduction

Speech perception has been investigated by degrading speech signals in time and frequency. Periodic interruption [Fig. 1(b)], introduced systematically by Miller and Licklider (1950), has been one of the typical techniques that degrade speech in the time domain [e.g., Powers and Wilcox (1977), Shafiro et al. (2016), and Shafiro et al. (2018)], along with local time reversal [e.g., Steffen and Werani (1994), Saberi and Perrott (1999), Matsuo et al. (2020), Ueda et al. (2017), Ueda et al. (2019), Ueda and Ciocca (2021), and Ueda and Matsuo (2021)]. A further group of techniques employs small units of time and frequency and transforms the signal properties within each unit. Pointillistic speech (Kidd et al., 2009), mosaic speech (Nakajima et al., 2018; Santi et al., 2020), and pixelated speech (Schlittenlacher et al., 2019) belong to this group.
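The on/off patterns behind periodic interruption and a time-by-frequency checkerboard can be illustrated with a minimal sketch. It assumes hard rectangular gating with no onset or offset ramps, and it leaves out the band-pass filtering that would split speech into frequency bands; the function names `interruption_mask` and `checkerboard_mask` are hypothetical and are not taken from the study.

```python
import numpy as np


def interruption_mask(n_samples, fs, segment_ms):
    """Gate that alternates between on (1.0) and off (0.0) every segment_ms.

    Assumption: rectangular gating, starting with an "on" segment.
    """
    seg = int(round(fs * segment_ms / 1000.0))  # samples per segment
    seg_index = np.arange(n_samples) // seg     # segment number of each sample
    return (seg_index % 2 == 0).astype(float)


def checkerboard_mask(n_bands, n_samples, fs, segment_ms):
    """Band-by-time gate in which neighbouring bands are on in alternate segments.

    Row b is the gate for band b; it would multiply that band's filtered signal.
    """
    seg = int(round(fs * segment_ms / 1000.0))
    seg_index = np.arange(n_samples) // seg         # shape (n_samples,)
    band_index = np.arange(n_bands)[:, None]        # shape (n_bands, 1)
    # Parity of (segment + band) gives the checkerboard pattern.
    return ((seg_index[None, :] + band_index) % 2 == 0).astype(float)


if __name__ == "__main__":
    gate = interruption_mask(n_samples=640, fs=1000, segment_ms=160)
    board = checkerboard_mask(n_bands=4, n_samples=640, fs=1000, segment_ms=160)
    print("proportion retained (interrupted):", gate.mean())
    print("proportion retained (checkerboard):", board.mean())
```

When the signal length is a whole number of on/off cycles, both masks retain half of the time-frequency area, which matches the 50% retention described in the abstract. In the checkerboard case the retained half is spread across bands, so at every instant some bands are present.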
