Abstract

A system for speaker-based audio indexing and speaker tracking in broadcast news audio is presented. Producing indexing information for continuous audio streams based on detected speakers is treated as a multistage process composed of several tasks. The main building blocks of such an indexing system are audio segmentation, speaker detection, speaker clustering, and speaker identification. In the proposed system, three probabilistic linear discriminant analysis (PLDA) variants (standard, simplified, and two-covariance) and a Gaussian mixture model (GMM) are considered for the speaker identification stage. The evaluation is performed on audio data from the broadcast news domain, and the results show that the two-covariance PLDA model outperforms the other evaluated methods.
