Acoustic beamforming is a technique used in audio engineering and signal processing to filter sound waves in the spatial domain. Typically, an array of microphones captures sound arriving from different directions, and the microphone signals are combined so that they reinforce each other in the desired direction while noise and interference from other directions are cancelled. An important application is the cocktail party scenario, in which several people speak simultaneously in a noisy room and only the speech of the desired speaker is to be captured. In such cases, the target speech is usually modelled as a zero-mean complex Gaussian random variable. However, target speech coefficients are sparse in the time-frequency domain. In our work, we therefore model the speech coefficients with a zero-mean circular Laplacian distribution. Based on this model, we formulate a beamformer using the maximum likelihood criterion and add a distortionless constraint to further improve performance. The final solution of the proposed beamformer encourages sparsity, indicating that the Laplacian model describes the target speech better than the complex Gaussian distribution does. Simulations show the effectiveness of the proposed beamformer in capturing the target speech and rejecting interfering speakers.
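
To make the idea concrete, the following is a minimal sketch and not the derivation from the full paper. It assumes that, for a single frequency bin, a maximum likelihood criterion under a zero-mean circular Laplacian prior reduces to minimizing the sum of output magnitudes |w^H x_t| subject to the distortionless constraint w^H d = 1, and it solves that problem with iteratively reweighted least squares (IRLS), a solver chosen here purely for illustration. The function name `irls_distortionless_beamformer`, the steering vector `d`, the iteration count, and the small constants `eps` and the diagonal loading factor are all hypothetical choices.

```python
import numpy as np

def irls_distortionless_beamformer(X, d, n_iter=20, eps=1e-6):
    """Sketch: minimize sum_t |w^H x_t| subject to w^H d = 1 via IRLS.

    X : (M, T) complex array of snapshots at one frequency bin (assumed layout).
    d : (M,) complex steering vector of the target (assumed known).
    """
    M, T = X.shape
    # Start from delay-and-sum weights, which already satisfy w^H d = 1.
    w = d / np.vdot(d, d)
    for _ in range(n_iter):
        y = w.conj() @ X                        # beamformer output, shape (T,)
        g = 1.0 / np.maximum(np.abs(y), eps)    # L1 reweighting; eps avoids division by zero
        R = (X * g) @ X.conj().T / T            # magnitude-weighted covariance
        R += 1e-8 * np.trace(R).real / M * np.eye(M)  # light diagonal loading (assumption)
        Rinv_d = np.linalg.solve(R, d)
        w = Rinv_d / np.vdot(d, Rinv_d)         # closed-form constrained minimizer for fixed weights
    return w
```

Each pass has the same form as a minimum-power distortionless beamformer, except that snapshots with small output magnitude receive larger weights, which is what pushes the output toward sparsity. With a Gaussian prior the weights would all be equal and a single pass would suffice.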