Abstract

Dynamic time warping (DTW) can be used to compute the similarity between two sequences of generally differing length. We propose a modification to DTW that performs individual and independent pairwise alignment of feature trajectories. The modified technique, termed feature trajectory dynamic time warping (FTDTW), is applied as a similarity measure in the agglomerative hierarchical clustering of speech segments. Experiments using MFCC and PLP parametrisations extracted from TIMIT and from the Spoken Arabic Digit Dataset (SADD) show consistent and statistically significant improvements in the quality of the resulting clusters in terms of F-measure and normalised mutual information (NMI).

Highlights

  • Dynamic time warping (DTW) is a method of optimally aligning two distinct time series of generally different length

  • The direct approach to keyword spotting has recently been extended by training a convolutional neural network (CNN) to emulate the template matching performed by DTW, thereby providing a substantial computational advantage [8, 9]

  • We describe a modification of DTW and demonstrate its improved performance when used as a similarity measure to cluster speech segments

Summary

Introduction

Dynamic time warping (DTW) is a method of optimally aligning two distinct time series of generally different length. In addition to the alignment, DTW computes a score indicating the similarity of the two sequences. This ability to quantify the similarity between time series led to the application of DTW in automatic speech recognition (ASR) systems several decades ago [1, 2]. It has remained popular in this field, with more recent developments reported in [3] and [4]. The direct approach to keyword spotting has recently been extended by training a convolutional neural network (CNN) to emulate the template matching performed by DTW, thereby providing a substantial computational advantage [8, 9].
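
To make the alignment score concrete, the following is a minimal sketch of classic DTW in Python, together with one possible reading of the per-trajectory idea stated in the abstract. It is illustrative only and is not the authors' implementation. The function names (`dtw_distance`, `euclidean`, `per_trajectory_dtw`), the unconstrained three-way step pattern, the absolute-difference local cost, and the choice to sum the per-dimension costs without normalisation are all assumptions made for this example.

```python
import math


def dtw_distance(x, y, dist):
    """Accumulated cost of the best monotonic alignment of x and y.

    `dist` is the local distance between one element of x and one of y.
    """
    n, m = len(x), len(y)
    # acc[i][j] = minimum accumulated cost of aligning x[:i] with y[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Allowed predecessors: vertical, horizontal and diagonal steps
            best_prev = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = dist(x[i - 1], y[j - 1]) + best_prev
    return acc[n][m]


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal dimension."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def per_trajectory_dtw(x, y):
    """Align each feature dimension independently and sum the costs.

    Assumption: costs are summed without normalisation; the source text
    here does not specify how the per-trajectory scores are combined.
    """
    total = 0.0
    for d in range(len(x[0])):
        traj_x = [frame[d] for frame in x]  # trajectory of feature d in x
        traj_y = [frame[d] for frame in y]  # trajectory of feature d in y
        total += dtw_distance(traj_x, traj_y, lambda a, b: abs(a - b))
    return total


# Two short sequences of 2-dimensional feature vectors of different length
seq_a = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
seq_b = [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
joint_score = dtw_distance(seq_a, seq_b, euclidean)
split_score = per_trajectory_dtw(seq_a, seq_b)
```

In this sketch, `dtw_distance` forces all feature dimensions to share a single warping path, whereas `per_trajectory_dtw` lets each dimension find its own path. The two scores can therefore differ whenever the features of one segment evolve at different rates relative to the other.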

