The speakers in the room corpus

Aaron Lawson,Jeff Hetherly,Paul Gamble,Martin Graciarena,Colleen Richey,Todd Stavish,Zeb Armstrong,Cory Stephenson,Karl Ni,Maria Barrios

doi:10.1121/1.5036132

Aaron Lawson, Jeff Hetherly + Show 8 more

https://doi.org/10.1121/1.5036132

Copy DOI

Export

Save

Cite

Abstract
Full-Text
Similar Papers

Abstract

Listen

The speakers in the room (SITR) corpus is a collaboration between Lab41 and SRI International, designed to be a freely available data set for speech and acoustics research in noisy room conditions. The main focus of the corpus is on distant microphone collection in a series of four rooms of different sizes and configurations. There are both foreground speech and background adversarial sounds, played through high-quality speakers in each room to create multiple, realistic acoustic environments. The foreground speech is played from a randomly rotating speaker to emulate head motion. Foreground speech consists of files from LibriVox audio collections and the background distractor sounds will consist of babble, music, HVAC, TV/radio, dogs, vehicles, and weather sounds drawn from the MUSAN collection. Each room has multiple sessions to exhaustively cover the background foreground combinations, and the audio is collected with twelve different microphones (omnidirectional lavalier, studio cardioid, and piezoelectric) placed strategically around the room. The resulting data set was designed to enable acoustic research on event detection, background detection, source separation, speech enhancement, source distance, sound localization, as well as speech research on speaker recognition, speech activity detection, speech recognition, and language recognition.

Full Text