Creating speech zones with self-distributing acoustic swarms

Malek Itani, Tuochao Chen, Takuya Yoshioka, Shyamnath Gollakota

doi:10.1038/s41467-023-40869-8


Open Access

https://doi.org/10.1038/s41467-023-40869-8


Abstract

Imagine being in a crowded room with a cacophony of speakers and having the ability to focus on or remove speech from a specific 2D region. This would require understanding and manipulating an acoustic scene, isolating each speaker, and associating a 2D spatial context with each constituent speech. However, separating speech from a large number of concurrent speakers in a room into individual streams and identifying their precise 2D locations is challenging, even for the human brain. Here, we present the first acoustic swarm that demonstrates cooperative navigation with centimeter resolution using sound, eliminating the need for cameras or external infrastructure. Our acoustic swarm forms a self-distributing wireless microphone array, which, along with our attention-based neural network framework, lets us separate and localize concurrent human speakers in 2D space, enabling speech zones. Our evaluations showed that the acoustic swarm could localize and separate 3-5 concurrent speech sources in real-world unseen reverberant environments with median and 90th-percentile 2D errors of 15 cm and 50 cm, respectively. Our system enables applications like mute zones (parts of the room where sounds are muted), active zones (regions where sounds are captured), multi-conversation separation, and location-aware interaction.
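To make the notions of a mute zone and of median and 90th-percentile 2D error concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes the separation and localization stages have already produced one estimated (x, y) position per speaker, and the `Zone` class, the function names, the rectangular zone shape, and the nearest-rank percentile rule are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Zone:
    """Hypothetical axis-aligned rectangular region of a room, in metres."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def split_by_mute_zone(speakers, mute_zone):
    """Partition {name: (x, y)} into (kept, muted) by estimated position."""
    kept, muted = {}, {}
    for name, (x, y) in speakers.items():
        target = muted if mute_zone.contains(x, y) else kept
        target[name] = (x, y)
    return kept, muted


def localization_errors(estimated, truth):
    """Euclidean 2D error per speaker, in the same units as the inputs."""
    return [math.dist(estimated[name], truth[name]) for name in truth]


def percentile_nearest_rank(values, p):
    """p-th percentile (integer p in 1..100) using the nearest-rank rule."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_errors(errors):
    """Return (median, 90th percentile) of a list of 2D errors."""
    return median(errors), percentile_nearest_rank(errors, 90)
```

For example, with a mute zone `Zone(0.0, 0.0, 1.0, 1.0)`, a speaker estimated at (0.5, 0.5) would be muted while one at (2.0, 2.0) would be kept. An active zone would be the same membership test with the two outputs swapped.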


Journal: Nature Communications
Publication Date: Sep 21, 2023
Citations: 3
License type: CC BY 4.0


