Abstract
As the amount of multimedia information with spoken content has grown, spoken document retrieval has attracted increasing attention. One useful application is retrieving video scenes similar to the scene currently being viewed. In this setting, the user specifies a time position in the video being viewed, and the spoken content extracted from the video around that time serves as the search query. The system then searches all the videos in the collection and outputs, as the search result, the time sections whose spoken content is similar to the query. In this report, we describe a methodology for evaluating such a system and report the results of an investigation into the relation between speech recognition performance and related spoken document retrieval performance. Simulation experiments confirmed that, for both search by question sentence and search by scene, recognition errors influence search performance in the order of substitution errors, deletion errors, and insertion errors. However, search by scene is more easily affected by speech recognition errors than search by question sentence. In particular, the influence of insertion errors was found to be larger when searching by scene than when searching by question sentence.
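To make the three error types concrete, the following is a minimal sketch of how substitution, deletion, and insertion errors could be injected into a reference transcript to simulate recognition output at controlled error rates. The abstract does not describe the authors' simulation procedure, so the function name `inject_errors`, its parameters, the uniform sampling of replacement words, and the toy vocabulary are all assumptions made for illustration only.

```python
import random


def inject_errors(words, vocab, sub_rate=0.0, del_rate=0.0, ins_rate=0.0, seed=0):
    """Return a copy of `words` with simulated recognition errors.

    Hypothetical helper, not the authors' procedure. Each reference word
    is deleted with probability `del_rate`, substituted with probability
    `sub_rate`, and otherwise kept; after each position a spurious word
    is inserted with probability `ins_rate`.
    Assumes del_rate + sub_rate <= 1.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    out = []
    for w in words:
        r = rng.random()
        if r < del_rate:
            pass  # deletion error: the word is dropped
        elif r < del_rate + sub_rate:
            # substitution error: replace with a different vocabulary word
            candidates = [v for v in vocab if v != w]
            out.append(rng.choice(candidates) if candidates else w)
        else:
            out.append(w)  # word recognized correctly
        if rng.random() < ins_rate:
            out.append(rng.choice(vocab))  # insertion error: spurious word
    return out


reference = "the quick brown fox".split()
vocabulary = ["cat", "dog", "the", "fox"]  # toy vocabulary (assumption)
noisy = inject_errors(reference, vocabulary, sub_rate=0.2, del_rate=0.1, ins_rate=0.1)
```

Applying one error type at a time, and to either the query or the indexed transcripts, would allow the retrieval performance for each condition to be compared against the error-free baseline.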