This paper presents the results of a subjective assessment, by articulation tests, of the intelligibility of Ukrainian monosyllabic sound combinations in the presence of background noise and reverberation. The evaluation used specially developed software that automates the articulation test procedure, making it considerably easier and faster. Special text and sound tables of Ukrainian monosyllables of the consonant-vowel-consonant (CVC) type were developed for the tests. The clean signals were recorded in the sound-damped room of the Acoustics and Acoustoelectronics Department of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", at a sampling frequency of 22050 Hz and a bit depth of 16 bits. The monosyllables were read within a verbal context. Listening tests covered four conditions: clean speech; speech distorted by noise; speech distorted by reverberation; and speech distorted by the combined effect of noise and reverberation. In the first condition, listeners heard the monosyllables of 3 articulation tables, each containing 50 monosyllables. In the second, they heard speech with additive noise at signal-to-noise ratios (SNR) of -10 dB, 0 dB and +10 dB; models of white, pink and brown noise were used, whose masking properties are rather well studied. In the third, reverberant speech for reverberation times from 0.3 to 2.7 s was modeled by convolving the clean speech signals with the impulse responses of various rooms. In the fourth, the joint action of pink noise and reverberation was considered. The masking ability of white noise was found to exceed that of brown noise for SNRs below -5 dB, which is not entirely consistent with earlier preliminary predictive estimates.
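The noise and reverberation conditions described above can be illustrated with a minimal Python sketch. It is not the authors' software. The function names, the spectral-shaping method for pink and brown noise, and the synthetic exponentially decaying impulse response are all assumptions made for illustration; the study convolved speech with impulse responses of various rooms, which are not reproduced here. Only the sampling rate, the SNR values and the reverberation-time range come from the abstract.

```python
import numpy as np

FS = 22050  # sampling rate stated in the abstract, Hz


def colored_noise(n, exponent, rng):
    """Noise with power spectrum ~ 1/f**exponent (0 white, 1 pink, 2 brown)."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / FS)
    freqs[0] = freqs[1]  # avoid division by zero at DC
    spectrum /= freqs ** (exponent / 2.0)  # amplitude ~ f**(-exponent/2)
    return np.fft.irfft(spectrum, n)


def mix_at_snr(speech, noise, snr_db):
    """Scale noise so 10*log10(P_speech / P_noise) equals snr_db, then add it."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


def synthetic_rir(rt60, rng):
    """Toy impulse response (assumption): noise decaying by 60 dB in rt60 s."""
    t = np.arange(int(rt60 * FS)) / FS
    envelope = 10.0 ** (-3.0 * t / rt60)  # amplitude reaches -60 dB at t = rt60
    rir = rng.standard_normal(t.size) * envelope
    return rir / np.sqrt(np.sum(rir ** 2))  # normalize to unit energy


def reverberate(speech, rir):
    """Linear convolution of speech with an impulse response, computed via FFT."""
    n = speech.size + rir.size - 1
    return np.fft.irfft(np.fft.rfft(speech, n) * np.fft.rfft(rir, n), n)


rng = np.random.default_rng(0)
speech = rng.standard_normal(FS)  # one-second stand-in signal, not real speech
noisy = mix_at_snr(speech, colored_noise(speech.size, 1, rng), snr_db=-10)
reverberant = reverberate(speech, synthetic_rir(0.3, rng))
```

In practice the stand-in signal would be replaced by a recorded monosyllable and the synthetic response by a measured room impulse response. How the study measured speech power for the SNR (for example, over the whole recording or over the active speech only) is not stated in the abstract, so the whole-signal average used here is an assumption.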
The signals were presented in two ways: through headphones and through acoustic monitors. The SNR ranged from -10 to +10 dB, and the reverberation time from 0.3 to 2.7 s. Listening to noise-distorted speech through acoustic monitors turned out to give significantly higher speech intelligibility than listening through headphones: at an SNR of -10 dB, speech intelligibility scores increased from 0.1-0.3 to 0.85-0.93. Similar results were obtained for reverberation: at a reverberation time of 2.7 s, speech intelligibility increased from 0.65 to 0.94. This growth in intelligibility can be explained, first, by the action of early reflections and, second, by the use of two loudspeakers as sound sources together with binaural listening. These are unlikely to be the only reasons, however, so the causes of the observed significant increase in speech intelligibility require further study. Possible causes include features of the developed automated articulation test system as well as the psychophysical state of the listeners. Ref. 13, fig. 5.