Abstract

In this paper, we present a vocal suppression algorithm that can enhance the quality of music signal coded using Spatial Audio Object Coding (SAOC) in Karaoke mode. The residual vocal component in the coded music signal is estimated by using a cross prediction method in which the music signal coded in Karaoke mode is used as the primary input and the vocal signal coded in Solo mode is used as a reference. However, the signals are extracted from the same downmix signal and highly correlated, so that the music signal can be severely damaged by the cross prediction. To prevent this, a psycho-acoustic disturbance rule is proposed, in which the level of disturbance to the reference input of the cross prediction filter is adapted according to the auditory masking property. Objective and subjective test were performed and the results confirm that the proposed algorithm offers improved quality.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call