Unsupervised speaker segmentation of multi-speaker speech data

Allen Louis Gorin

doi:10.1121/1.3074485

Journal: The Journal of the Acoustical Society of America
Publication Date: Jan 1, 2009
Citations: 1

#Speech Data #Speaker Segmentation

Abstract

Systems and methods are described for unsupervised segmentation of multi-speaker speech or audio data by speaker. A front-end analysis is applied to the input speech data to obtain feature vectors. The speech data is initially segmented, and the segments are then clustered into groups that correspond to different speakers. The clusters are iteratively modeled and resegmented to obtain stable speaker segmentations. The overlap between segmentation sets is checked to ensure successful speaker segmentation. Overlapping segments are combined, remodeled, and resegmented. Optionally, the speech data is processed to produce a segmentation lattice to maximize the overall segmentation likelihood.
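
The segment, cluster, model, and resegment loop summarized in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the method claimed in the source: the feature vectors are synthetic 2-D points standing in for front-end output, the initial segmentation is a uniform split, clustering is a plain k-means over segment means, and each speaker is modeled by a single diagonal Gaussian. The function names (`uniform_segments`, `cluster_segments`, `model_and_resegment`, `segment_speakers`) are hypothetical. The overlap check and the segmentation lattice are not covered.

```python
import numpy as np

def uniform_segments(n_frames, seg_len):
    """Initial segmentation (assumed): fixed-length (start, end) frame ranges."""
    return [(s, min(s + seg_len, n_frames)) for s in range(0, n_frames, seg_len)]

def cluster_segments(feats, segments, n_speakers, n_iter=20):
    """Group segments by k-means on their mean feature vectors (assumed stand-in)."""
    means = np.array([feats[s:e].mean(axis=0) for s, e in segments])
    # Deterministic farthest-point initialization of the centroids.
    centroids = [means[0]]
    for _ in range(1, n_speakers):
        dist = np.min([((means - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(means[dist.argmax()])
    centroids = np.array(centroids)
    labels = np.zeros(len(means), dtype=int)
    for _ in range(n_iter):
        d = ((means[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(n_speakers):
            if np.any(labels == k):
                centroids[k] = means[labels == k].mean(axis=0)
    return labels

def model_and_resegment(feats, segments, seg_labels, n_speakers, max_iter=10):
    """Fit one diagonal Gaussian per cluster, relabel segments, repeat until stable."""
    seg_labels = seg_labels.copy()
    for _ in range(max_iter):
        loglik = np.full((len(segments), n_speakers), -np.inf)
        for k in range(n_speakers):
            idx = [i for i in range(len(segments)) if seg_labels[i] == k]
            if not idx:
                continue  # empty cluster keeps -inf likelihood
            x = np.concatenate([feats[segments[i][0]:segments[i][1]] for i in idx])
            mu, var = x.mean(axis=0), x.var(axis=0) + 1e-3
            frame_ll = -0.5 * ((feats - mu) ** 2 / var + np.log(2 * np.pi * var)).sum(axis=1)
            loglik[:, k] = [frame_ll[s:e].sum() for s, e in segments]
        new_labels = loglik.argmax(axis=1)
        if np.array_equal(new_labels, seg_labels):
            break  # segmentation no longer changes
        seg_labels = new_labels
    return seg_labels

def segment_speakers(feats, n_speakers=2, seg_len=20):
    """Run the sketch end to end and return one speaker label per frame."""
    segments = uniform_segments(len(feats), seg_len)
    seg_labels = cluster_segments(feats, segments, n_speakers)
    seg_labels = model_and_resegment(feats, segments, seg_labels, n_speakers)
    frame_labels = np.empty(len(feats), dtype=int)
    for (s, e), k in zip(segments, seg_labels):
        frame_labels[s:e] = k
    return frame_labels

# Synthetic "conversation": speaker A, then speaker B, then speaker A again.
rng = np.random.default_rng(0)
feats = np.concatenate([
    rng.normal(0.0, 1.0, (100, 2)),
    rng.normal(5.0, 1.0, (100, 2)),
    rng.normal(0.0, 1.0, (100, 2)),
])
labels = segment_speakers(feats)
print(labels[::20])  # one label per 20-frame segment
```

In this toy setup the speaker changes fall exactly on segment boundaries, so relabeling whole segments is enough. Real speech would need boundaries that can move at the frame level, which is what a proper resegmentation pass would provide.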
