Abstract
This paper presents a novel double-layer neighborhood graph index that accelerates similarity search for fast query-by-example spoken term detection (STD). Given a query segment, the proposed STD method finds segments similar to the query in an utterance data set through an efficient similarity search that traverses the double-layer neighborhood graph (DLG) at low computational cost. Each segment is a sequence of Gaussian mixture model posteriorgram frames and corresponds to a vertex in the DLG. The dissimilarity between vertices is measured by dynamic time warping. The DLG consists of two distinct degree-reduced k-nearest neighbor graphs, one in a base layer and one in an upper layer. The base-layer graph contains all the vertices in the data set, while the upper-layer graph includes only representatives extracted from the vertices in the base layer. By way of analogy, search in the DLG resembles driving on a suitable mix of general roads and express highways to save travel time. Experimental results on the MIT lecture corpus demonstrate that the proposed method reduces CPU time by 40% compared to the most recent method and by more than 60% compared to the ordinary graph-based method, while keeping almost the same precision.
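The two-layer traversal described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the greedy best-neighbor descent, the Euclidean frame distance, the function names (`frame_dist`, `dtw`, `greedy_search`, `dlg_search`), and the toy graphs are all assumptions made for illustration, and the degree-reduction step and representative extraction are omitted.

```python
import math

def frame_dist(p, q):
    # Assumed local distance between two posteriorgram frames (Euclidean);
    # the abstract does not state which frame-level distance is used.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def dtw(x, y):
    # Plain dynamic time warping cost between two frame sequences.
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(x[i - 1], y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def greedy_search(graph, segments, query, start):
    # Move to the neighbor closest to the query until no neighbor improves.
    current, current_d = start, dtw(query, segments[start])
    while True:
        best, best_d = current, current_d
        for nb in graph[current]:
            d = dtw(query, segments[nb])
            if d < best_d:
                best, best_d = nb, d
        if best == current:
            return current, current_d
        current, current_d = best, best_d

def dlg_search(upper, base, segments, query, entry):
    # Upper layer ("highway"): coarse search over representatives only.
    rep, _ = greedy_search(upper, segments, query, entry)
    # Base layer ("general roads"): refine, starting from that representative.
    return greedy_search(base, segments, query, rep)

# Hypothetical toy data: eight segments of two 1-dimensional frames each.
segments = [[[float(i)], [float(i)]] for i in range(8)]
# Base layer: every vertex, linked to its immediate neighbors.
base = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
# Upper layer: only the assumed representatives 0, 4, and 7.
upper = {0: [4], 4: [0, 7], 7: [4]}

query = [[5.2], [5.2]]
vertex, distance = dlg_search(upper, base, segments, query, entry=0)
```

In this toy example the upper layer jumps from vertex 0 to representative 4 in one hop, and the base layer then refines the result to vertex 5, so fewer DTW evaluations are needed than when walking the base layer alone from vertex 0.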