Differences in speech intelligibility in noise measured under spatial sound reproduction implemented with varying recording and rendering techniques
This paper studies differences in speech intelligibility in noise measured under reproduced acoustic environments implemented with different recording and rendering techniques. The acoustics of two rooms with different volumes and reverberation times were reproduced by spherical harmonics-based spatial sound reproduction, and speech intelligibility under the reproduced acoustic environments was compared with that in the original rooms through subjective listening tests. Four implementations of spatial sound reproduction, realised by combining two recording techniques (first- and higher-order Ambisonics microphones) and two rendering techniques (headphones and a loudspeaker array), were evaluated. When the room was highly reverberant, the implementations using either a first- or higher-order Ambisonics microphone with a 32-channel loudspeaker array yielded speech intelligibility not significantly different from that observed in the original real rooms. The same implementation with the higher-order Ambisonics microphone also most accurately replicated the effects of angular separation between the speech and noise sources and of source distance on speech intelligibility. The results also suggest that the rendering technique has a larger effect than the recording technique on how well speech intelligibility in the real room is reproduced, whereas the recording technique has a larger effect on how well the effect of angular separation between speech and noise sources is reproduced.
- Research Article
- 10.3397/in_2023_0065
- Nov 30, 2023
- INTER-NOISE and NOISE-CON Congress and Conference Proceedings
The present study investigates differences in speech intelligibility in noise under Ambisonics-based virtual acoustic environments realised by various recording and rendering methods. Subjective listening tests were conducted under four implementations of Ambisonics-based virtual acoustic environments reproducing a seminar room, each using a different combination of recording and rendering techniques. For recording, both first- and higher-order Ambisonics microphone arrays were used; for rendering, headphones (binaural) and a 32-channel loudspeaker array were used. The results were also compared against data collected in the real room. The experimental results suggest that speech intelligibility under the virtual acoustic environments does not differ significantly across the modes of recording and rendering; however, it was generally lower than in the original room. Spatial release from masking was also observed under the virtual acoustic environments when the higher-order Ambisonics microphone array, the 32-channel loudspeaker array, or both were used; however, it was limited to the case in which the speech source was located close to the listener.
- Research Article
- 10.1121/10.0022700
- Oct 1, 2023
- The Journal of the Acoustical Society of America
Noise reduction strategies based on time-frequency masking have been shown to provide substantial improvements in the intelligibility of speech in noise for both listeners with normal and impaired hearing. Previous studies on this type of noise reduction have involved intact, well-articulated speech spoken by healthy talkers (i.e., not disordered speech). However, the prevalence of speech disorders such as dysarthria is substantial, particularly in the aging population, which is highly susceptible to the impacts of hearing loss. We recently demonstrated the considerable impact of background noise on dysarthric speech as well as the effectiveness of time-frequency masking to improve the intelligibility of this disordered speech in noise for listeners with normal hearing (Borrie et al., 2023). Here, we present data on the feasibility of time-frequency masking to increase the intelligibility of dysarthric speech for listeners with sensorineural hearing impairment. Preliminary results indicate that the Ideal Quantized Mask, a method of time-frequency masking, significantly improves percent words correct scores of dysarthric speech in noise for listeners with hearing loss. Results will be discussed in relation to the specific impact of background noise on dysarthric speech, and the relationship between sensorineural hearing loss and dysarthric speech.
- Research Article
8
- 10.1016/j.apacoust.2020.107707
- Nov 4, 2020
- Applied Acoustics
Speech intelligibility in noise with varying spatial acoustics under Ambisonics-based sound reproduction system
- Research Article
13
- 10.1177/2331216515618902
- Dec 1, 2015
- Trends in Hearing
The aim of the present study was to determine the relations between the intelligibility of speech in noise and measures of auditory resolution, loudness recruitment, and cognitive function. The analyses were based on data published earlier as part of the presentation of the Auditory Profile, a test battery implemented in four languages. Tests of the intelligibility of speech, resolution, loudness recruitment, and lexical decision making were measured using headphones in five centers: in Germany, the Netherlands, Sweden, and the United Kingdom. Correlations and stepwise linear regression models were calculated. In sum, 72 hearing-impaired listeners aged 22 to 91 years with a broad range of hearing losses were included in the study. Several significant correlations were found with the intelligibility of speech in noise. Stepwise linear regression analyses showed that pure-tone average, age, spectral and temporal resolution, and loudness recruitment were significant predictors of the intelligibility of speech in fluctuating noise. Complex interrelationships between auditory factors and the intelligibility of speech in noise were revealed using the Auditory Profile data set in four languages. After taking into account the effects of pure-tone average and age, spectral and temporal resolution and loudness recruitment had an added value in the prediction of variation among listeners with respect to the intelligibility of speech in noise. The results of the lexical decision making test were not related to the intelligibility of speech in noise, in the population studied.
- Research Article
104
- 10.1109/tassp.1976.1162824
- Aug 1, 1976
- IEEE Transactions on Acoustics, Speech, and Signal Processing
This paper presents the results of an examination of rapid amplitude compression following high-pass filtering as a method for processing speech, prior to reception by the listener, as a means of enhancing the intelligibility of speech in high noise levels. Arguments supporting this particular signal processing method are based on the results of previous perceptual studies of speech in noise. In these previous studies, it has been shown that high-pass filtered/clipped speech offers a significant gain in the intelligibility of speech in white noise over that for unprocessed speech at the same signal-to-noise ratios. Similar results have also been obtained for speech processed by high-pass filtering alone. The present paper explores these effects and it proposes the use of high-pass filtering followed by rapid amplitude compression as a signal processing method for enhancing the intelligibility of speech in noise. It is shown that this new method results in a substantial improvement in the intelligibility of speech in white noise over normal speech and over previously implemented methods.
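The processing chain named in this abstract (high-pass filtering followed by rapid amplitude compression) can be illustrated with a minimal sketch. The abstract does not give the filter type, cutoff, compression ratio, or time constant, so the first-order filter, the 1 kHz cutoff, the 4:1 ratio, and the 1 ms envelope time constant below are all assumptions chosen for illustration, not the paper's settings.

```python
import math

def high_pass(x, cutoff_hz, fs):
    """First-order RC high-pass filter (filter type is an assumption)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / fs)
    y, prev_x, prev_y = [], 0.0, 0.0
    for s in x:
        cur = alpha * (prev_y + s - prev_x)
        y.append(cur)
        prev_x, prev_y = s, cur
    return y

def fast_compress(x, fs, ratio=4.0, time_const_s=0.001, eps=1e-6):
    """Compress level with a fast envelope follower.

    The envelope is a one-pole smoothing of |x|; the gain env**(1/ratio - 1)
    maps an envelope level env to env**(1/ratio), i.e. ratio:1 compression.
    """
    coeff = math.exp(-1.0 / (time_const_s * fs))
    env, out = 0.0, []
    for s in x:
        env = coeff * env + (1.0 - coeff) * abs(s)
        gain = max(env, eps) ** (1.0 / ratio - 1.0)
        out.append(s * gain)
    return out

def enhance(x, fs, cutoff_hz=1000.0):
    """Hypothetical helper: high-pass filtering, then rapid compression."""
    return fast_compress(high_pass(x, cutoff_hz, fs), fs)

if __name__ == "__main__":
    fs = 16000.0
    tone = [0.05 * math.sin(2.0 * math.pi * 2000.0 * n / fs) for n in range(320)]
    print(len(enhance(tone, fs)))
```

With a 4:1 ratio, a 20 dB difference in steady input level is reduced to about 5 dB at the output, which raises weak speech segments relative to strong ones before noise is encountered.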
- Research Article
3
- 10.1097/aud.0000000000001235
- May 30, 2022
- Ear and Hearing
Objectives: To study the effectivity of a transformed NAL non-linear version 2 (NAL-NL2) gain prescription for percutaneous bone conduction devices (BCDs) and to investigate how to take into account output constraints for the lower frequencies. Design: The NAL-NL2 prescription was converted to a bone conduction prescription rule. Adaptations were needed, as this converted rule prescribes more output at low frequencies than the device delivers. Three adaptations with different audibility and compression were compared. Setting 1 (S1, “optimal audibility”) had most audibility due to adapted frequency-dependent compression, setting 2 (S2, “moderate audibility”) had moderate output reduction below 1 kHz, and setting 3 (S3, “reduced audibility, least distortion”) had most output reduction. Eighteen experienced BCD users rated their relative sound quality in paired comparisons for different sounds (own voice, mixed voices, traffic noise, and music). In addition speech intelligibility in quiet and noise were assessed. Results: The relative sound quality rating for the adapted prescriptions varied between the stimuli: more low-frequency sound was preferred for music (S1 over S3), and less low-frequency sound was preferred for the own voice (S2 and S3 over S1). No differences in quality rating were found for mixed voices or traffic noise. Speech intelligibility in quiet scores at 45 dB SPL was significantly lower for S3 than for S1. Speech intelligibility in noise was significantly reduced in all settings and S3 yielded significantly better speech intelligibility in noise than S1. Conclusions: With a moderate gain reduction for low frequencies to comply with device constraints the transformed NAL-NL2 prescription was found suitable for fitting BCDs. Perceived sound quality depended on the gain settings, but also on the sound spectra and how the sound was appreciated. A moderate gain reduction below 1 kHz seems to be the optimal adaptation as it has a neutral or positive relative sound quality for all stimuli without negative effects on speech intelligibility. The NAL-NL2-BC prescribed a sufficient amount of gain, as indicated by the speech tests.
- Research Article
8
- 10.1016/j.csl.2019.02.003
- Mar 9, 2019
- Computer Speech & Language
On the feasibility of using a bispectral measure as a nonintrusive predictor of speech intelligibility
- Research Article
2
- 10.1007/s12070-024-05173-x
- Nov 2, 2024
- Indian journal of otolaryngology and head and neck surgery : official publication of the Association of Otolaryngologists of India
Prominent ear deformity, affecting up to 5% of the white population, is characterized by an absent antihelical fold and conchal cartilage hypertrophy. These anatomical changes can potentially alter auditory function, including speech intelligibility in noise. This study aimed to evaluate changes in auditory function, particularly speech intelligibility in noise, following otoplasty for prominent ear deformities. This prospective study included 25 patients undergoing otoplasty from January 2018 to December 2023. Audiometric evaluations, including pure-tone audiometry, tympanometry, and speech intelligibility scores (SIS), were conducted preoperatively and three months postoperatively at different signal-to-noise ratios (SNRs) and noise orientations. Preoperative SIS was significantly reduced at a -5 dB SNR with posterior noise, suggesting a pinna shielding effect. Postoperative evaluations showed no significant changes in pure-tone averages or speech discrimination scores. Otoplasty does not impair auditory function but may alter sound directionality in noisy environments, highlighting the need for further research.
- Research Article
22
- 10.1097/01.aud.0000145109.90767.ba
- Oct 1, 2004
- Ear and Hearing
To evaluate the improvement in speech intelligibility in noise obtained with an assistive real-time fixed endfire array of bidirectional microphones in comparison with an omnidirectional hearing aid microphone in a realistic environment. The microphone array was evaluated physically in anechoic and reverberant conditions. Perceptual tests of speech intelligibility in noise were carried out in a reverberant room, with two types of noise and six different noise scenarios with single and multiple noise sources. Ten normal-hearing subjects and 10 hearing aid users participated. The speech reception threshold for sentences was measured in each test setting for the omnidirectional microphone of the hearing aid and for the hearing aid in combination with the array with one and three active microphones. In addition, the extra improvement of five active array microphones, relative to three, was determined in another group of 10 normal-hearing listeners. Improvements in speech intelligibility in noise obtained with the array relative to an omnidirectional microphone depend on noise scenario and subject group. Improvements up to 12 dB for normal-hearing and 9 dB for hearing-impaired listeners were obtained with three active array microphones relative to an omnidirectional microphone for one noise source at 90 degrees. For three uncorrelated noise sources at 90 degrees, 180 degrees, and 270 degrees, improvements of approximately 9 dB and 6 dB were obtained for normal-hearing and hearing-impaired listeners, respectively. Even with a single noise source at 45 degrees, benefits of 4 dB were achieved in both subject groups. Five active microphones in the array can provide an additional improvement at 45 degrees of approximately 1 dB, relative to the three-microphone configuration for normal-hearing listeners. These improvements in signal-to-noise ratio can be of great benefit for hearing aid users, who have difficulties with speech understanding in noisy environments.
- Research Article
9
- 10.1121/1.1906933
- Sep 1, 1952
- The Journal of the Acoustical Society of America
The effect upon the intelligibility of speech in noise of the interaction of sharp frequency limiting and severe peak clipping was studied. The results are compared with previously reported results of similar tests with frequency-limited speech signals that were not subjected to amplitude distortion. The intelligibility of unclipped speech, relative to that of the peak-clipped signal under corresponding experimental conditions, is a function of the signal-to-noise (S/N) ratio under test and is, to a rough approximation, independent of the frequency range of the speech signal passed. At high S/N ratios, the intelligibility of the unclipped speech signal is higher than that of the severely peak-clipped signal. Under low S/N ratios, however, the intelligibility of the latter is considerably higher than that of the unclipped signal.
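The two manipulations this abstract relies on, peak clipping and presentation at a controlled S/N ratio, can be illustrated with a minimal sketch. The function names, the RMS-based definition of S/N ratio, and the clipping fraction are assumptions for illustration; the abstract does not specify how the original study set levels or how severe the clipping was.

```python
import math

def rms(x):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in x) / len(x))

def scale_noise_for_snr(speech, noise, snr_db):
    """Scale the noise so that the RMS speech-to-noise ratio equals snr_db."""
    target_noise_rms = rms(speech) / (10.0 ** (snr_db / 20.0))
    gain = target_noise_rms / rms(noise)
    return [gain * n for n in noise]

def peak_clip(x, clip_fraction):
    """Limit the waveform to +/- clip_fraction of its original peak value."""
    limit = clip_fraction * max(abs(s) for s in x)
    return [max(-limit, min(limit, s)) for s in x]

if __name__ == "__main__":
    # Stand-in signals: a sine for "speech" and a deterministic sequence for "noise".
    speech = [math.sin(2.0 * math.pi * 5.0 * n / 100.0) for n in range(100)]
    noise = [((n * 37) % 11 - 5) / 5.0 for n in range(100)]
    clipped = peak_clip(speech, clip_fraction=0.1)  # severe clipping (assumed value)
    scaled_noise = scale_noise_for_snr(clipped, noise, snr_db=0.0)
    mixture = [s + n for s, n in zip(clipped, scaled_noise)]
    print(len(mixture))
```

Whether the S/N ratio is defined on the speech before or after clipping changes the comparison between clipped and unclipped conditions, so that choice should follow the original study rather than this sketch.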
- Research Article
62
- 10.1097/00003446-199606000-00004
- Jun 1, 1996
- Ear and Hearing
The development of a test of virtual speech intelligibility in noise that enables assessment in typical, everyday listening situations. To eliminate extraneous confounding factors, digital signal processing was incorporated to simulate listening environments and source locations and allow presentation of stimuli via earphones. Source-to-eardrum transfer functions measured on KEMAR for various source locations in anechoic and reverberant environments were used to process monosyllabic words and speech-spectrum noise. Speech intelligibility was measured for three speech and noise configurations in two environments using an adaptive procedure to determine the signal-to-noise (S/N) ratio for 50% intelligibility. Normal-hearing listeners achieved 50% intelligibility of monosyllabic words at significantly lower S/N ratios in a virtual anechoic environment than in a virtual reverberant environment. Speech intelligibility improved significantly in both environments when the speech and noise sources were separated, but the intelligibility gain in the anechoic environment was four times larger than in the reverberant environment. This test is easy to administer and score, and it provides a means for measuring: 1) the effects of separating speech and noise sources and 2) the effects of reverberation on speech intelligibility in noise while eliminating confounding factors such as calibration.
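The abstract mentions an adaptive procedure for finding the S/N ratio at 50% intelligibility without describing it. One common choice is a simple one-up/one-down staircase, sketched below; the rule, step size, stopping criterion, and the deterministic stand-in listener are assumptions, not details taken from the paper.

```python
def staircase_snr50(respond, start_snr_db=0.0, step_db=2.0,
                    n_reversals=8, max_trials=200):
    """Estimate the S/N ratio for 50% correct with a 1-up/1-down staircase.

    respond(snr_db) returns True if the word was repeated correctly.
    The S/N ratio drops after a correct response and rises after an error;
    the estimate is the mean S/N ratio at the reversal points.
    """
    snr = start_snr_db
    last_direction = 0
    reversals = []
    for _ in range(max_trials):
        direction = -1 if respond(snr) else +1
        if last_direction != 0 and direction != last_direction:
            reversals.append(snr)
            if len(reversals) == n_reversals:
                break
        last_direction = direction
        snr += direction * step_db
    if not reversals:
        raise ValueError("no reversals reached; adjust start level or step size")
    return sum(reversals) / len(reversals)

if __name__ == "__main__":
    # Hypothetical listener who is correct whenever the S/N ratio is at least -6.5 dB.
    listener = lambda snr_db: snr_db >= -6.5
    print(staircase_snr50(listener))
```

A real listener responds probabilistically, so the track wanders around the 50% point instead of alternating between two levels, and the estimate lies within roughly one step of the true threshold only on average.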
- Research Article
26
- 10.1001/jamaoto.2017.0745
- Jun 22, 2017
- JAMA Otolaryngology–Head & Neck Surgery
Importance: To date, no randomized clinical trial on the comparison between simultaneous and sequential bilateral cochlear implants (BiCIs) has been performed. Objective: To investigate the hearing capabilities and the self-reported benefits of simultaneous BiCIs compared with those of sequential BiCIs. Design, Setting, and Participants: A multicenter randomized clinical trial was conducted between January 12, 2010, and September 2, 2012, at 5 tertiary referral centers among 40 participants eligible for BiCIs. Main inclusion criteria were postlingual severe to profound hearing loss, age 18 to 70 years, and a maximum duration of 10 years without hearing aid use in both ears. Data analysis was conducted from May 24 to June 12, 2016. Interventions: The simultaneous BiCI group received 2 cochlear implants during 1 surgical procedure. The sequential BiCI group received 2 cochlear implants with an interval of 2 years between implants. Main Outcomes and Measures: First, the results 1 year after receiving simultaneous BiCIs were compared with the results 1 year after receiving sequential BiCIs. Second, the results of 3 years of follow-up for both groups were compared separately. The primary outcome measure was speech intelligibility in noise from straight ahead. Secondary outcome measures were speech intelligibility in noise from spatially separated sources, speech intelligibility in silence, localization capabilities, and self-reported benefits assessed with various hearing and quality of life questionnaires. Results: Nineteen participants were randomized to receive simultaneous BiCIs (11 women and 8 men; median age, 52 years [interquartile range, 36-63 years]), and another 19 participants were randomized to undergo sequential BiCIs (8 women and 11 men; median age, 54 years [interquartile range, 43-64 years]). Three patients did not receive a second cochlear implant and were unavailable for follow-up. Comparable results were found 1 year after simultaneous or sequential BiCIs for speech intelligibility in noise from straight ahead (difference, 0.9 dB [95% CI, –3.1 to 4.4 dB]) and all secondary outcome measures except for localization with a 30° angle between loudspeakers (difference, –10% [95% CI, –20.1% to 0.0%]). In the sequential BiCI group, all participants performed significantly better after the BiCIs on speech intelligibility in noise from spatially separated sources and on all localization tests, which was consistent with most of the participants’ self-reported hearing capabilities. Speech intelligibility-in-noise results improved in the simultaneous BiCI group up to 3 years following the BiCIs. Conclusions and Relevance: This study shows comparable objective and subjective hearing results 1 year after receiving simultaneous BiCIs and sequential BiCIs with an interval of 2 years between implants. It also shows a significant benefit of sequential BiCIs over a unilateral cochlear implant. Until 3 years after receiving simultaneous BiCIs, speech intelligibility in noise significantly improved compared with previous years. Trial Registration: trialregister.nl Identifier: NTR1722
- Research Article
- 10.1097/00043764-197109000-00023
- Sep 1, 1971
- Journal of Occupational and Environmental Medicine
A significant improvement in speech intelligibility in a background noise was shown in a group of industrial subjects conditioned to working in noise compared with a control group of university staff. Progressive deterioration of speech intelligibility in noise was found with noise-induced hearing loss after losses had occurred at the 2 kHz pure-tone audiometric frequency.
- Research Article
37
- 10.1080/00140137008931173
- Sep 1, 1970
- Ergonomics
A significant improvement in speech intelligibility in a background noise was shown in a group of industrial subjects conditioned to working in noise compared with a control group of university staff. Progressive deterioration of speech intelligibility in noise was found with noise-induced hearing loss after losses had occurred at the 2 kHz pure-tone audiometric frequency.
- Conference Article
12
- 10.1109/icassp.2012.6288794
- Mar 1, 2012
In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We first explain how the Glimpse Proportion measure operates and further show how we approximated it to integrate it into an existing spectral envelope parameter extraction method commonly used in the HMM-based speech synthesis framework. We then demonstrate how this new method changes the modelled spectrum according to the characteristics of the noise and show results for a listening test with vocoded and HMM-based synthetic speech. The test indicates that the proposed method can significantly improve intelligibility of synthetic speech in speech shaped noise.
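As a rough illustration of the kind of quantity a glimpse-based measure computes, the sketch below counts the fraction of time-frequency cells in which the speech exceeds the noise by a local S/N threshold. It assumes the energies have already been computed on a common time-frequency grid, and the 3 dB threshold is an assumed default. It is not the approximation the authors integrated into their parameter extraction method, which the abstract does not detail.

```python
import math

def glimpse_proportion(speech_energy, noise_energy, threshold_db=3.0):
    """Fraction of time-frequency cells where local S/N exceeds threshold_db.

    speech_energy and noise_energy are equally sized lists of rows
    (frequency channels by time frames) holding non-negative energies.
    """
    total = 0
    glimpsed = 0
    for s_row, n_row in zip(speech_energy, noise_energy):
        for s, n in zip(s_row, n_row):
            total += 1
            # A cell with speech energy and no noise counts as glimpsed.
            if s > 0.0 and (n <= 0.0 or 10.0 * math.log10(s / n) > threshold_db):
                glimpsed += 1
    return glimpsed / total

if __name__ == "__main__":
    speech = [[4.0, 1.0], [1.0, 10.0]]
    noise = [[1.0, 1.0], [2.0, 1.0]]
    print(glimpse_proportion(speech, noise))
```

Modifying the clean speech so that more cells clear the threshold for a given noise is the general idea behind using such a measure as an optimisation target.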