Abstract

This paper describes recent studies on speaker diarization from automatic broadcast news transcripts. Linguistic information revealing the true names of the people who speak during a broadcast (the next, the previous and the current speaker) is detected by means of linguistic patterns. To associate the true speaker names with the speech segments, a set of rules is defined for each pattern. Since the effectiveness of linguistic patterns for diarization depends on the quality of the transcription, the results obtained with automatic transcripts generated by a large vocabulary continuous speech recognition (LVCSR) system are compared with those obtained with manual transcriptions. On about 150 hours of broadcast news data (295 shows), the overall rate of false identity association is about 13% for both the automatic and the manual transcripts.
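
The pattern-and-rule idea can be illustrated with a minimal sketch. It is not the system described in this paper: the three regular expressions, the `NAME` expression, the `assign_names` helper and the example segments are all hypothetical, and the sketch assumes cased English text already split into speaker segments. Each pattern is paired with an offset that plays the role of its rule: 0 names the current speaker, -1 the previous one and +1 the next one.

```python
import re

# Hypothetical name matcher: two or more capitalized words (assumes cased text).
NAME = r"([A-Z][a-z]+(?: [A-Z][a-z]+)+)"

# Illustrative patterns only; each is paired with the segment offset its rule targets.
PATTERNS = [
    (re.compile(r"\b(?i:I am|I'm|this is) " + NAME), 0),     # current speaker
    (re.compile(r"\b(?i:thank you|thanks),? " + NAME), -1),  # previous speaker
    (re.compile(NAME + r" (?:reports|has more)"), +1),       # next speaker
]


def assign_names(segments):
    """Return one name (or None) per segment, keeping the first name found."""
    names = [None] * len(segments)
    for i, text in enumerate(segments):
        for pattern, offset in PATTERNS:
            for match in pattern.finditer(text):
                target = i + offset
                # Ignore targets outside the show and keep earlier assignments.
                if 0 <= target < len(segments) and names[target] is None:
                    names[target] = match.group(1)
    return names


segments = [
    "Good evening, I'm Jane Doe. John Smith reports from Paris.",
    "The talks ended today.",
    "Thank you, John Smith. In other news",
]
names = assign_names(segments)
```

In this toy example the first segment names its own speaker and announces the next one, and the third segment points back to the second without naming its own speaker. A segment that no pattern reaches stays unnamed, and recognition errors on the name or its trigger words would break the match, which is why transcript quality matters for this kind of approach.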
