Abstract

This study investigates how virtual head rotations can improve a binaural model's ability to segregate speech signals. The model takes two mixed speech sources spatialized to distinct azimuth positions and localizes them. It then virtually rotates its head to the orientation that maximizes the signal-to-noise ratio for extracting the target. An equalization-cancellation approach generates a binary mask for the target based on localization cues, and the mask is overlaid onto the mixed signal's spectrogram to extract the target from the mixture. Signal-to-noise-ratio improvements from head rotation can exceed 30 dB.
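The final extraction step described above can be illustrated with a minimal sketch. This is not the paper's implementation: the paper derives the mask from binaural localization cues after an equalization-cancellation stage, whereas the toy below uses oracle knowledge of the two sources (an "ideal binary mask") purely to show how a binary mask overlaid on a mixture spectrogram recovers the target. The signals, sample rate, and STFT parameters are all illustrative assumptions.

```python
# Illustrative sketch only: ideal-binary-mask extraction of a target
# from a two-source mixture. The real model builds its mask from
# binaural localization cues, not oracle source knowledge.
import numpy as np
from scipy.signal import stft, istft

fs = 8000                                      # assumed sample rate
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 440 * t)           # stand-in target source
interferer = np.sin(2 * np.pi * 1800 * t)      # stand-in masker source
mixture = target + interferer

# Spectrograms (complex STFTs) of each signal.
_, _, T = stft(target, fs=fs, nperseg=256)
_, _, I = stft(interferer, fs=fs, nperseg=256)
_, _, M = stft(mixture, fs=fs, nperseg=256)

# Binary mask: keep time-frequency cells where the target dominates.
mask = (np.abs(T) > np.abs(I)).astype(float)

# Overlay the mask on the mixture spectrogram and resynthesize.
_, estimate = istft(M * mask, fs=fs, nperseg=256)
estimate = estimate[: len(target)]

def snr_db(ref, est):
    """SNR of an estimate relative to a reference signal, in dB."""
    noise = ref - est
    return 10 * np.log10(np.sum(ref ** 2) / np.sum(noise ** 2))

print(f"mixture SNR:  {snr_db(target, mixture):.1f} dB")
print(f"masked SNR:   {snr_db(target, estimate):.1f} dB")
```

Because the two stand-in tones occupy disjoint frequency bands, the mask removes nearly all interferer energy, so the masked estimate's SNR is far above the mixture's 0 dB starting point.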
