Articles published on Reverberation Time
3229 Search results
- Research Article
- 10.1016/j.ijporl.2026.112772
- Apr 1, 2026
- International journal of pediatric otorhinolaryngology
- Ananya Kanigalpula + 2 more
Can you understand me in class? Effects of age and reverberation on speech recognition in school-age children.
- Research Article
- 10.3390/acoustics8010017
- Mar 7, 2026
- Acoustics
- Jukka Keränen + 2 more
Enclosed learning spaces, e.g., classrooms, are used in most schools. Open learning spaces, which enable teaching more than one group of students at a time, have become increasingly popular. A recent survey showed that acoustic satisfaction was lower among teachers working in open learning spaces. Our purpose was to compare the acoustic conditions of these learning space types. We investigated the room acoustic quality of 73 learning spaces in 20 schools. Ten schools involved only enclosed and ten both open and enclosed learning spaces. Measurements concerned speech transmission index, STI, background noise level, LAeq, and reverberation time, T. Variation in results in both learning space types was rather large. In enclosed learning spaces, STI varied within 0.64–0.83, LAeq within 25–47 dB, and T within 0.34–0.82 s. The corresponding variations in open learning spaces were 0.47–0.91, 29–44 dB, and 0.44–0.72 s. The differences between enclosed and open learning spaces were surprisingly small. Due to the different intended uses of these space types, Finnish target values are tighter for open than for enclosed learning spaces. These target values were fulfilled in 56% of enclosed and 9% of open learning spaces. The more frequent violation of target values in open learning spaces was due to the STI being too large at longer distances. Our study provides suggestive evidence that the room acoustic conditions are worse in open than enclosed learning spaces. Further research is needed to prove whether room acoustic conditions could explain worse acoustic satisfaction in teachers.
- Research Article
- 10.3390/acoustics8010016
- Mar 3, 2026
- Acoustics
- Teo Poldrugovac + 2 more
The Church of St. Francis in Pula, Croatia, is a well-preserved example of Franciscan gothic sacral architecture from the late 13th century. As preaching was highly valued by the Franciscan order as a way of communicating with the faithful, the study is focused on determining whether speech intelligibility in the church would have been adequate for successful communication between priests and their audience. The archaeoacoustic analysis of the church was performed in four stages: (1) in situ acoustic measurements in the present state, (2) development and calibration of the model of the present state based on measurement results, (3) development of the two models of the presumed historical state based on the calibrated model and historical data, and (4) prediction of acoustic conditions in the present and the historical states in terms of reverberation time T30 and of speech intelligibility in terms of speech transmission index STI. The factors considered in the study were (1) acoustics of the church, (2) profile of the audience (friars and the faithful), (3) layout of the audience areas (choir area in the front of the nave for the friars, back area of the nave for the faithful), (4) positions of the speech sources (altar for addressing the friars, pulpit for addressing the faithful), (5) occupancy (unoccupied and fully occupied church), (6) language used in liturgical ceremonies (Latin and native language), and (7) language proficiency of the audience (native speakers, users of a second language). 
The results show that (1) fair speech intelligibility (STI ≥ 0.45 for the faithful as native speakers, STI ≥ 0.50 for friars as non-native speakers of Latin) can be achieved for 50% of the audience in the choir area and for the entire audience in the back area in favourable conditions (fully occupied church, audience addressed from dedicated speaker positions), (2) the position of the pulpit (close to the audience and considerably elevated above it) is more favourable than the position of the altar (remote, barely elevated above the audience), and (3) in unoccupied conditions, fair speech intelligibility can still be achieved in at least 50% of the back audience area with the faithful gathered close to the pulpit, while it is not possible for the front audience area addressed from the altar. The summary conclusion is that the church of St. Francis in its presumed historical layout(s) would fulfil its primary function in a limited capacity. Fair speech intelligibility would likely have been sufficient for the audience to follow liturgical ceremonies conducted in the church, but not without difficulty.
- Research Article
- 10.3390/buildings16040808
- Feb 16, 2026
- Buildings
- Zhirui Zhu + 4 more
The acoustic design of outpatient halls is often ignored, yet the acoustic environment significantly impacts patients’ physical and mental well-being as well as their visit experience. This paper takes the outpatient hall of a hospital in South China as a case, employs in-situ acoustic measurement, and conducts a quantitative analysis of its indoor acoustic environment through acoustic simulation. The in-situ measurements show that the noise level, speech intelligibility and reverberation time in the hall all fail to meet standard requirements. The poor acoustic quality is mainly due to the lack of acoustic design. Consequently, this study proposes measures to improve the outpatient hall’s acoustic environment from two aspects, namely sound absorption and building form design. These measures include sound absorption treatment, adjustment of the hall’s floor height and optimization of planar length-to-width ratio. The redesigned outpatient hall plan demonstrates an evident enhancement in acoustic quality and validates the effectiveness of the proposed redesign strategy. This study can provide practical guidance for the design and acoustic renovation of healthcare buildings and can also offer new insights for the redesign of hospital outpatient halls from the perspective of improving the acoustic environments.
- Research Article
- 10.1177/1351010x261421483
- Feb 15, 2026
- Building Acoustics
- Antonella Bevilacqua + 1 more
The Royal Festival Hall (RFH) in London, opened in 1951, is renowned architecturally but has long-standing acoustic challenges affecting clarity, warmth, and reverberation. This study investigates the hall’s acoustic performance and evaluates interventions to improve its auditory qualities other than pursuing the idea of changing theatre chairs since the focus has been on this topic. Acoustic measurements of existing theatre chairs were conducted using a Microflown sensor, and the data were incorporated into a geometrical acoustics model to simulate the effects of a sustainable plant-based leather (Piñatex) covering. Simulations considered both unoccupied and occupied conditions, alongside modifications to architectural elements, including curtains on balconies, ceiling reflectors, and increased floor and wall mass. Results show that Piñatex improves reverberation time at mid and high frequencies under unoccupied conditions but has negligible effect when the hall is fully occupied. Additional architectural interventions enhance reverberation and other parameters, including clarity (C50, C80) and strength (G), although low-frequency response remains limited. This study highlights that the hall’s shallow and wide form constrains acoustics, suggesting that structural adjustments or reinforcement sound systems are necessary for an optimal room’s response.
- Research Article
- 10.3390/buildings16040749
- Feb 12, 2026
- Buildings
- Deniss Mironovs
The acoustic design of learning spaces is commonly carried out using geometrical acoustics simulations or analytical calculations. While 3D simulations provide high accuracy, they are time-consuming and resource-intensive, whereas analytical calculations are limited to reverberation time. This study proposes a set of empirical regression formulas for estimating the speech clarity index C50 from the reverberation time T30 and the source–receiver distance r. The models are intended for design verification, allowing a quick assessment of whether a proposed acoustic solution meets speech clarity criteria without using numerical simulations. A total of 455 measurement entries from 28 rooms were analyzed, representing three categories of acoustic conditions. Polynomial and logarithmic regression models were developed and evaluated using three statistical criteria: the adjusted coefficient of determination (R²adj), the Akaike Information Criterion (AIC), and the root mean square error (RMSE). The results show that logarithmic models generally provide better fit consistency across room types, whereas polynomial models describe lower frequency bands more accurately. The proposed relationships demonstrate practical potential for predicting C50 for mid–high frequencies in real rooms using analytically obtained T30 values and geometric distances. The proposed models are intended for early-stage building and classroom design, where numerical simulations are not yet available.
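The three criteria named in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's procedure: the AIC is written in one common least-squares form, n·ln(RSS/n) + 2k, with k counted as the number of predictors plus the intercept (an assumption, since the abstract does not state the convention), and the C50 values below are invented purely for demonstration.

```python
import math

def fit_criteria(observed, predicted, n_predictors):
    """Return (adjusted R^2, AIC, RMSE) for a fitted regression model."""
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - rss / tss
    # Adjusted R^2 penalises the number of predictors
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    # Assumed AIC convention: k = predictors + intercept
    k = n_predictors + 1
    aic = n * math.log(rss / n) + 2 * k
    rmse = math.sqrt(rss / n)
    return adj_r2, aic, rmse

# Hypothetical C50 values in dB (not taken from the paper)
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
adj_r2, aic, rmse = fit_criteria(observed, predicted, n_predictors=1)
print(round(adj_r2, 4), round(aic, 3), round(rmse, 4))
```

A higher adjusted R² and lower AIC and RMSE indicate the better of two candidate models fitted to the same data.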
- Research Article
- 10.1007/s10055-026-01312-7
- Feb 6, 2026
- Virtual Reality
- Mona Alawadh + 3 more
We introduce a new approach for constructing immersive virtual spaces by generating comprehensive 3D voxelised models that encompass both geometric and semantic scene representations from a single 360° RGB-D input. The proposed approach utilises a deep convolutional neural network for semantic scene completion (SSC), allowing the estimation of complete semantics and geometries of the scene. We design MDBNet, a dual-head model that simultaneously processes RGB and depth data using a perspective camera. Depth information is encoded using a flipped transcribed signed distance function (F-TSDF), capturing essential geometric shape characteristics. We extend the inference capabilities of MDBNet on RGB-D input of the perspective camera to accommodate 360° RGB-D by proposing MDBNet360. We employ RGB spherical-to-cubic projection and 3D rotation for depth point clouds, allowing for virtual reality (VR) space design with comprehensive spatial coverage. To our knowledge, this is the first work to extend a pre-trained SSC model, originally using perspective camera RGB-D input, to infer a 3D model from 360° RGB-D input. To assess acoustic properties, we measure parameters such as early decay time (EDT) and reverberation time (RT60) using the exponential sine sweep method (ESS). We used Unity with the Steam Audio plug-in for conducting simulations in virtual space. The proposed framework demonstrates better virtual space reconstruction and immersive sound generation, advancing semantically rich and spatially accurate virtual environments compared to the state-of-the-art (SOTA). Code and rendered sounds are available on GitHub: https://github.com/MonaIA1/Repo360 .
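The abstract does not say how EDT and RT60 were derived once an impulse response had been obtained from the sweep. A standard route is Schroeder backward integration followed by a straight-line fit to the decay curve, sketched below under that assumption. The impulse response here is a synthetic, noise-free exponential decay with a known 0.5 s reverberation time; the sample rate, duration, and function names are illustrative choices, not taken from the paper or its repository.

```python
import math

def schroeder_decay_db(ir):
    """Backward-integrated energy decay curve in dB, 0 dB at t = 0."""
    energy = 0.0
    edc = [0.0] * len(ir)
    for i in range(len(ir) - 1, -1, -1):
        energy += ir[i] * ir[i]
        edc[i] = energy
    total = edc[0]
    # Zero-energy samples can only occur at the tail, so indices stay aligned
    return [10.0 * math.log10(e / total) for e in edc if e > 0.0]

def decay_time(edc_db, fs, start_db, end_db):
    """Least-squares slope between two levels, extrapolated to a 60 dB decay."""
    pts = [(i / fs, lvl) for i, lvl in enumerate(edc_db) if end_db <= lvl <= start_db]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    num = sum((t - mean_t) * (l - mean_l) for t, l in pts)
    den = sum((t - mean_t) ** 2 for t, _ in pts)
    return -60.0 * den / num

# Synthetic impulse-response envelope: amplitude falls by 60 dB in true_rt seconds
fs = 1000          # assumed sample rate in Hz
true_rt = 0.5      # assumed reverberation time in s
ir = [math.exp(-math.log(1000.0) * (i / fs) / true_rt) for i in range(fs)]

edc_db = schroeder_decay_db(ir)
t30 = decay_time(edc_db, fs, -5.0, -35.0)   # T30: fit from -5 dB to -35 dB
edt = decay_time(edc_db, fs, 0.0, -10.0)    # EDT: fit from 0 dB to -10 dB
print(round(t30, 3), round(edt, 3))
```

With a measured response, background noise bends the tail of the decay curve, so the response is usually truncated or noise-compensated before the fit.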
- Research Article
- 10.1121/10.0042532
- Feb 1, 2026
- The Journal of the Acoustical Society of America
- Kosuke Goto + 1 more
The measurement of sound absorption coefficients in a reverberation chamber often involves uncertainties owing to the insufficient diffusivity of the room sound field, which results from the low modal density at lower frequencies. This paper proposes a measurement method that uses damping density (DD) to address this problem. The DD treats the damping constants (DCs) at each frequency as a probability density function, and the DCs at each frequency are calculated from the room impulse response. A preliminary study showed that the proposed method yielded lower reverberation times (RTs) than conventional methods while maintaining measurement stability. Furthermore, the results confirmed that the proposed method successfully evaluated the initial decay characteristics. Measurements of 200 mm-thick urethane foam in an actual reverberation chamber demonstrated that the proposed method yielded intermediate RTs between early decay times and conventional RTs in the low-frequency range (below 315 Hz) under empty room conditions and achieved improved measurement stability across multiple measurement paths. The resulting sound absorption coefficients showed the smallest relative errors compared with the theoretical values in the 80-250 Hz range, except at 200 Hz.
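For context, the conventional reverberation-room calculation that this abstract compares against can be sketched as follows. It is the Sabine-based relation between the reverberation times with and without the specimen, not the damping-density method the paper proposes. The sketch assumes the same speed of sound in both measurements and neglects any change in air absorption; the room volume, specimen area, and reverberation times are hypothetical.

```python
def equivalent_absorption_area(volume_m3, t_empty_s, t_specimen_s, speed_of_sound=343.0):
    """Absorption area (m^2) added by the specimen, from the two reverberation times."""
    return 55.3 * volume_m3 / speed_of_sound * (1.0 / t_specimen_s - 1.0 / t_empty_s)

def absorption_coefficient(volume_m3, specimen_area_m2, t_empty_s, t_specimen_s):
    """Sound absorption coefficient: added absorption area per unit specimen area."""
    area = equivalent_absorption_area(volume_m3, t_empty_s, t_specimen_s)
    return area / specimen_area_m2

# Hypothetical chamber: 200 m^3, 10 m^2 specimen, RT drops from 5.0 s to 2.5 s
alpha = absorption_coefficient(200.0, 10.0, t_empty_s=5.0, t_specimen_s=2.5)
print(round(alpha, 3))
```

Because the result depends directly on the two reverberation times, any instability in the low-frequency decay estimate carries straight through to the coefficient, which is the problem the abstract addresses.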
- Research Article
- 10.32734/gfj.v4i1.23911
- Jan 29, 2026
- Global Forest Journal
- Marlena Wojnowska + 1 more
The article provides an overview of information on the impact of wood panels and wood-based composites on room acoustics. It focuses on key parameters such as reverberation time (RT), sound absorption, and speech and music clarity (Clarity, STI). Wood-based panels and composites show strong potential as sustainable acoustic solutions, reducing reverberation time and improving clarity, while requiring careful optimization of porosity, thickness, and perforation to balance performance across frequencies.
- Research Article
- 10.3390/app16020819
- Jan 13, 2026
- Applied Sciences
- Silvana Sukaj + 2 more
Acoustically, Baroque theatres have proved remarkably appropriate for opera, and, in the past, little distinction was drawn in design between drama and opera use, except for the inclusion of an orchestra pit, because both music and words were audible and balanced, reverberation times being shorter than in concert halls but longer than in speech auditoria. In a drama configuration, scenery is set in the fly tower on stage, while for opera pieces, in most cases, the orchestra pit platform rises to the main floor level of the stalls to set additional seat rows. Considering the characteristics of the Opera di Roma (IT), the case study, the main physical parameters that contribute to the sound quality are evaluated and compared in relation to the pit position level, in order to understand the possible merits of the covering seats on the pit surface for drama representations and, more generally, for speech activities. Eight different configurations are compared and, to evaluate the acoustic parameters’ sensitivity, the JND (just noticeable difference) is analyzed. The parameters’ trend is described.
- Research Article
- 10.3390/buildings16020331
- Jan 13, 2026
- Buildings
- Xiaoyun Yue + 5 more
As unique forms of intangible cultural heritage of Inner Mongolia, traditional musical instruments from the region have undergone significant changes alongside socioeconomic development and evolving performance styles. The performance environment has transitioned from early outdoor and non-fixed venues to professional concert halls. Existing research has demonstrated a correlation between the acoustic quality of performance halls and their objective architectural acoustic parameters. However, no studies have been conducted in China on the acoustic parameters suitable for the performance environments of traditional Inner Mongolian musical instruments. This study determined the optimal acoustic environment for performances of traditional musical instruments, unique to Inner Mongolia, by employing computer simulations and subjective listening experiments in representative performance spaces. Participants were asked to select preferred audio samples of different reverberation times, generated by convolving the impulse responses of simulated spatial models with dry recordings of the instruments. Statistical analysis of the results revealed that the optimal reverberation times for traditional Inner Mongolian instruments are 1.2 s and 1.4 s in a theater space, and 0.9 s and 1.1 s in a rectangular space. Furthermore, under the influence of different factors, the four instruments exhibited distinct preferences for optimal reverberation values in the sampled spaces.
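The stimulus-generation step described here, convolving a simulated impulse response with a dry recording, can be illustrated with a direct-form convolution. This is a toy sketch with made-up sample values; the study's actual tools and signal lengths are not given in the abstract, and real audio would normally use an FFT-based routine for speed.

```python
def convolve(dry, impulse_response):
    """Direct-form linear convolution of a dry signal with a room impulse response."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

# Hypothetical three-sample dry signal and an impulse response with
# a direct sound (1.0) followed by one reflection (0.5) two samples later
dry = [1.0, 2.0, 3.0]
impulse_response = [1.0, 0.0, 0.5]
wet = convolve(dry, impulse_response)
print(wet)  # [1.0, 2.0, 3.5, 1.0, 1.5]
```

Swapping in impulse responses with different decay rates gives the same recording at different reverberation times, which is what listeners are then asked to compare.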
- Research Article
- 10.1186/s13636-025-00443-0
- Jan 13, 2026
- Journal on Audio, Speech, and Music Processing
- Hanyu Meng + 4 more
Estimating acoustic context parameters is essential for characterizing acoustic environments, thereby enhancing immersive perception in spatial audio creation and improving speech enhancement and dereverberation algorithms. In this paper, we propose a unified deep learning based framework that estimates various acoustic contexts, including frequency-dependent reverberation time (T30), direct-to-reverberant ratio, clarity (C50), room geometry, and sound source orientation from first-order Ambisonics (FOA) speech recordings. Our framework employs a novel feature, termed the Spectro-Spatial Covariance Vector (SSCV), which efficiently represents the temporal, spectral, and spatial information of FOA signals. This feature can be effectively utilized by several deep neural networks as back-ends. Experimental results demonstrate that the proposed framework, which incorporates spatial information derived from FOA recordings, significantly outperforms existing methods based solely on spectral information from single-channel audio, achieving more than a 50% reduction in estimation error across all acoustic context estimation tasks. Additionally, we introduce FOA-Conv3D, a novel back-end network that effectively utilizes the SSCV feature through a 3D convolutional encoder. FOA-Conv3D outperforms currently widely applied deep learning frameworks such as convolutional neural network and recurrent convolutional neural network back-end architectures in acoustic parameter and orientation estimation tasks, exhibiting greater robustness under both pink and babble noise conditions. Finally, ablation studies reveal the relative contributions of spectral, interaural level difference, and interaural phase difference cues within the SSCV representation.
- Research Article
- 10.1016/j.jvoice.2025.12.012
- Jan 1, 2026
- Journal of voice : official journal of the Voice Foundation
- Sven Franz + 3 more
Subjective assessment of voice and speech disorders, often based on hoarseness or breathiness, suffers from limited interrater reliability. Objective, composite metrics-such as the acoustic voice quality index (AVQI) and the acoustic breathiness index (ABI)-offer more consistent and reproducible alternatives for diagnosis and monitoring, although they show some sensitivity to recording conditions. The aim of this study is to analyze the contributions and dependencies of individual acoustic parameters to composite metrics. Building on this, the influence of room acoustics on objective measures and their parameters is systematically investigated. Using close-microphone recordings with negligible reverberation, the contribution of individual parameters to the composite measures was determined through variance and value range analyses. In 35 speech and language therapy (SLT) rooms, acoustic parameters such as reverberation time, impulse response, and background noise were measured. Their influence on objective measures from real voice samples was analyzed using mixed linear regression. Variance analyses show that, in particular, smoothed cepstral peak prominence (CPPS) substantially contributes to the predictive power of the composite measures and has a dominant influence on the investigated voice quality metrics. The results also demonstrate a strong impact of room acoustics on measurement accuracy - especially for mildly pathological voices and for reverberant speech recordings. Reverberation time and clarity measures were found to be crucial influencing factors and predictors. The investigated voice quality measures are largely determined by CPPS. However, CPPS is heavily influenced by room acoustic properties, which can cause unreliable prediction with indices such as AVQI and ABI. Despite these limitations, CPPS remains a strong predictor of perceptual grading and breathiness. 
For reliable use of objective voice quality metrics in clinical settings, standardization or optimization of recording conditions, or development of more robust analytical methods, is essential. These findings support the refinement of objective voice diagnostics and promote evidence-based approaches in SLT.
- Research Article
- 10.1121/10.0042163
- Jan 1, 2026
- The Journal of the Acoustical Society of America
- Yanan Du + 1 more
With the development of intelligent cabins, a specialized psychoacoustic evaluation method is needed to link subjective and objective parameters for audio quality design in vehicle cabins, where traditional psychoacoustic metrics are limited in capturing multidimensional perceptual differences. This study proposes a pentatonic harmonic audio system evaluation (PHASE) framework to evaluate audio quality in vehicle cabins. The reverberation time and frequency responses were measured in five representative cars. The harmonic features were derived from the frequency responses based on the pentatonic scale. Six music clips were recorded in the driver's and rear-right seats to generate ten sets (five driver's seat, four rear-right seat, and one baseline) of stimuli for evaluation. Subsequently, 50 participants evaluated audio quality for all stimuli on 100-point scales across 9 dimensions: timbre brightness/darkness, timbre warmth/coolness, clarity, distortion, dynamic range, bass quality, spatial impression, localization, and distance. The evaluation models were established through multiple linear regression between the subjective rating values and harmonic features. These models showed strong explanatory power (R² = 0.852–0.987) across all dimensions, effectively capturing multidimensional in-vehicle auditory perception. The PHASE framework demonstrates strong potential for evaluating audio quality in enclosed or irregular acoustic environments such as immersive virtual spaces and home theaters.
- Research Article
- 10.1051/e3sconf/202668502007
- Jan 1, 2026
- E3S Web of Conferences
- Erick Teguh Leksono + 2 more
Poor acoustics may impede communication and user experience, and yet traditional design approaches often ignore the varying acoustic needs. The present paper systematically reviews pertinent literature to assess how flexible geometric design options (e.g., adjustable surfaces, modular panels, or dynamic surfaces) affect relevant acoustic parameters (reverberation time, clarity, and sound distribution) in comparison with conventional empty auditorium designs. By following PRISMA Protocol (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) applied strictly to ensure transparency and reproducibility of the research process, 25 experimental and simulation studies from Scopus, Science Direct, Journal of The Acoustical Society of America, Sage, MDPI, and others were analyzed. Results indicate that proper geometric design can improve acoustics quality variation, particularly in multipurpose applications; however, implementation must consider cost and technical complexity. Therefore, conclusions and recommendations serve as an evidence-based guideline for architects and acoustical engineers involved in designing adaptive performance spaces in terms of real-time adjustment technologies and smart materials. On the other hand, new standards for acoustic evaluation of dynamic designs should also be created.
- Research Article
- 10.4103/nah.nah_260_25
- Jan 1, 2026
- Noise & health
- Daniel Onut Badea + 4 more
This review investigated the effects of external and classroom noise on school-aged children's cognitive performance. The analysis identified exposure patterns across studies; quantified effects on attention, memory, and reading processes; and developed a conceptual model connecting acoustic input, processing effort, and academic performance. A structured search was conducted to identify observational and experimental studies on noise exposure and cognitive outcomes in school environments. Eligible studies quantified noise using the equivalent continuous sound level, signal-to-noise ratio (SNR), and reverberation time (RT) and included outcomes in attention, memory, and reading tasks. Data were mapped across environmental noise and classroom acoustic conditions. Classroom exposure levels commonly ranged between 65 and 77 dB(A), and external exposure from aircraft and road-traffic sources often exceeded 60 dB(A) during teaching periods. Lower SNRs and longer RTs reduced speech clarity and increased the processing effort needed to follow spoken information across the study design. Performance decreased in tasks involving attention, verbal memory, and reading accuracy. Children from lower socioeconomic backgrounds, bilingual learners, and pupils with weaker attention control exhibited larger performance reductions under identical acoustic conditions. Integrating these results produced a conceptual model in which external noise at school entry and classroom noise during instruction form a continuous exposure sequence that increases processing effort and reduces the learning capacity. Noise affects children by increasing processing load and reducing the cognitive resources available for learning. The model outlines this mechanism. The findings indicate acoustic improvements in classrooms and systematic noise monitoring in school environments.
- Research Article
- 10.1121/10.0042241
- Jan 1, 2026
- The Journal of the Acoustical Society of America
- Dingding Xie + 2 more
This study introduces directional coherence loss coefficients (DCLCs) to quantify the angular distribution of sound field coherence loss, which arise from localized scattering distributions within an enclosure, from a receiver's perspective. Sound fields in the room are sampled by a spherical receiver and decomposed into plane waves using spherical harmonics expansion, yielding directional impulse responses (IRs). The time-dependent coherence coefficients between directional IRs in furnished rooms and their empty counterparts are analyzed for each direction. DCLCs are derived from these coherence coefficients and decide the transition between the coherent component-mainly representing specular reflections-and the incoherent component, accounting for scattering contributions from interior elements. This research extracts DCLCs from rooms with varying transducer positions founded on wave-based simulations and measurements from rooms with different element distributions. A hybrid model is proposed to reconstruct the sound field in a room with a single diffusive wall, where the coherent component is computed from the empty room case, and the incoherent component is simulated stochastically, with their relative weighting decided by DCLCs. Directional IRs from the hybrid model exhibit agreement with ground truth in terms of reverberation time, clarity, kurtosis, and spatial cross correlation coefficients, verifying the ability of DCLCs to characterize localized scattering distributions in enclosures.
- Research Article
- 10.3389/fpubh.2026.1781216
- Jan 1, 2026
- Frontiers in public health
- Karol Jędrasiak + 1 more
Advances in diffusion-based and neural rendering architectures have enabled the creation of synthetic audiovisual content that closely replicates natural facial dynamics, speech production, and environmental context. These developments pose a growing risk to clinical medicine and dentistry, where authentic audiovisual data support remote clinical assessment, communication, and medico-legal documentation. This study introduces an interpretable multimodal framework for deepfake detection that integrates visual, acoustic, and cross-modal coherence features, with decision thresholds derived exclusively from authentic recordings to ensure transparency and forensic accountability. Using the DeepFake RealWorld dataset comprising 46,371 audiovisual clips, including 77% with audio, we evaluated 47 descriptors across optical, bioacoustic, and synchronization domains. Clinical relevance was evaluated through simulated dental teleconsultations. Cross-modal metrics, particularly lip-speech synchrony (Δp = 0.21-0.22), phoneme-viseme alignment (Δp = 0.21), a widely used audio visual consistency cue in multimodal deepfake detection, identity coherence (Δp = 0.19), and scene-audio semantic consistency (Δp = 0.18) demonstrated the strongest discriminatory performance, with prevalence ratios of up to 2.7. Acoustic markers, including reduced jitter, shimmer, and shortened reverberation time (RT60; 0.12 s in synthetic vs. 0.28 s in real recordings), provided additional robustness. The framework maintained performance degradation below 15% under platform-scale compression and recapture artifacts. Additionally, the proposed framework was benchmarked against a standard open-source texture-oriented baseline detector based on the Xception architecture, with clip-level ROC AUC and balanced accuracy reported on the original clips and under the same platform transformations used in the robustness analysis. 
Simulated dental teleconsultations revealed that manipulated recordings introduce inconsistencies in mandibular motion, prosody-related facial dynamics, and ambient acoustic plausibility (mean Δp = 0.18; PR = 2.3), confirming the clinical relevance of multimodal coherence analysis. These results position coherence-based detection as a reliable, transparent, and domain-appropriate approach for safeguarding audiovisual integrity in remote dentistry, medicine, and related digital health applications.
- Research Article
- 10.1177/1351010x251390616
- Dec 25, 2025
- Building Acoustics
- Yusuke Hioka + 4 more
The current paper studies the differences in speech intelligibility in noise measured under reproduced acoustic environments implemented using different recording and rendering techniques. Acoustics of two rooms with different volume and reverberation time were reproduced by spherical harmonics-based spatial sound reproduction and the speech intelligibility under reproduced acoustic environments were compared to that in the original rooms by conducting subjective listening tests. Four implementations of spatial sound reproduction realised by the combinations of two recording techniques (using first and higher order Ambisonics microphones) and two rendering techniques (using a headphone and loudspeaker array) were evaluated. The experimental results found the speech intelligibility under reproduced acoustic environments implemented by using either a first- or higher-order Ambisonics microphone and a 32-ch loudspeaker array achieved to replicate results not significantly different from that observed in the original real rooms when the room is highly reverberant. The same implementation with the higher-order Ambisonics microphone also most accurately replicated the effect of angular separation between the speech and noise sources as well as source distance on the speech intelligibility. The results also suggest that the technique used for rendering would have larger effect on reproducibility of speech intelligibility in the real room than the recording technique whereas the technique used for recording would have larger effect on reproducibility of the effect of angular separation between speech and noise sources in the real room.
- Research Article
- 10.3390/acoustics8010002
- Dec 24, 2025
- Acoustics
- Hui Ma + 3 more
Using efficient voice alarms to ensure safe evacuation is important during emergencies, especially for the elderly. Factors that have important influence on speech perceptions have been investigated for several years. However, relatively few studies have specifically explored the key factors influencing perceptions of voice alarms in emergency situations. This study investigated the combined effects of speech rate (SR), signal-to-noise ratio (SNR), and reverberation time (RT) on older people’s perception of voice alarms. Thirty older adults were invited to evaluate speech intelligibility, listening difficulty, and perceived urgency after hearing 48 different voice alarm conditions. For comparison, 25 young adults were also recruited in the same experiment. The results for older adults showed that: (1) When SR increased, speech intelligibility significantly decreased, and listening difficulty significantly increased. Perceived urgency reached its maximum at the normal speech rate for older adults, in contrast to young adults, for whom urgency was greatest at the fast speech rate. (2) With the rising SNR, speech intelligibility and perceived urgency significantly increased, and listening difficulty significantly decreased. In contrast, with the rising RT, speech intelligibility and perceived urgency significantly decreased, while listening difficulty significantly increased. (3) RT exerted a relatively stronger independent influence on speech intelligibility and listening difficulty among older adults compared to young adults, which tended not to be substantially moderated by SR or SNR. The interactive effect of SR and RT on perceived urgency was significant for older people, but not significant for young people. These findings provide referential strategies for designing efficient voice alarms for the elderly.