Abstract

Non-negative matrix factorization (NMF) has recently been applied to temporal decomposition (TD) of speech spectral envelopes represented by line spectral frequencies (LSFs). A couple of inherent TD constraints, which are otherwise handled as ad hoc exceptions, have also been incorporated using NMF: LSF ordering and monotonic event functions. Here, these constraints are analyzed and a third inherent constraint is incorporated into an NMF analysis. This constraint is complementarity, in the sense that two overlapping event functions uniformly add up to one; so far it has been handled, at best, by a quadratic penalty term. We propose the use of an augmented Lagrangian that includes a term with the Lagrange multipliers (LMs). Additionally, a multiplicative update rule for the LMs is proposed, which fits naturally with the multiplicative updates of NMF. Further, previous difficulties with nonsmooth spectral envelopes have been resolved by obtaining the spectral envelopes from TANDEM-STRAIGHT spectrograms. Good results are reached at the tight event rate of 12.3 ev/s, with mean log-spectral distortions ranging from 1.2 dB to about 1.5 dB depending on the regularizations.
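For orientation, the complementarity constraint and a generic augmented Lagrangian for it can be written as follows. This is a textbook-form sketch, not the exact objective of the paper. The symbols are assumptions introduced here: Y is the matrix of LSF vectors, A the matrix of event targets, Φ the matrix of event functions φ_k(n) over frames n, λ_n the Lagrange multipliers, and ρ > 0 a penalty weight.

```latex
% Assumed notation (not taken from the abstract):
%   Y \approx A\Phi, with \Phi = [\varphi_k(n)] \ge 0
% Complementarity: the event functions sum to one at every frame n
c_n(\Phi) = \sum_{k} \varphi_k(n) - 1 = 0 \qquad \text{for all } n

% Generic augmented Lagrangian: fit term, multiplier term, quadratic penalty
\mathcal{L}_\rho(A, \Phi, \lambda)
  = \tfrac{1}{2}\,\lVert Y - A\Phi \rVert_F^2
  + \sum_{n} \lambda_n\, c_n(\Phi)
  + \tfrac{\rho}{2} \sum_{n} c_n(\Phi)^2
```

Setting every λ_n to zero leaves only the quadratic penalty term, which corresponds to the earlier treatment mentioned above; the multiplier term is what the augmented Lagrangian adds.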
