Cross-language spoken document retrieval using HMM-based retrieval model with multi-scale fusion

Wai-Kit Lo, P C Ching, Helen Meng

doi:10.1145/964161.964162

Abstract

Cross-language spoken document retrieval (CL-SDR) is the technology for automatically retrieving relevant information from a collection of spoken documents in a language different from that of the queries. With CL-SDR, information sources in different languages can be retrieved automatically, which significantly increases the number of searchable information sources. The HMM-based retrieval model is a probabilistic formulation of the retrieval problem, and it can be extended by taking advantage of its probabilistic nature. Specifically, we have incorporated a translation component so that the model can perform cross-language information retrieval (CLIR). In addition, this HMM-based CLIR model is extended for retrieval at subword scales. In this work, the extended HMM-based retrieval model is applied to an English-Mandarin CL-SDR task: searching a Mandarin spoken document collection with English queries at word and subword scales. Retrieval results obtained at these indexing scales are then fused for multi-scale CL-SDR. Experimental results demonstrate that fusing the word and subword scales improves CL-SDR retrieval performance.
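The abstract does not give the model's equations, so the following is only a minimal sketch of how such a pipeline might look. It assumes a common two-state mixture form of HMM-based retrieval, in which each query term is generated either from the document or from a background model, with a translation table linking English query terms to Mandarin document units. The function names, the mixture weight `doc_weight`, the min-max normalization, and the linear fusion weight `word_weight` are all illustrative assumptions, not details taken from the paper.

```python
import math


def hmm_query_log_likelihood(query_terms, doc_counts, background_probs,
                             translation_probs, doc_weight=0.7):
    """Return log P(Q | D) under an assumed two-state mixture model.

    query_terms:       list of query-language (e.g. English) terms
    doc_counts:        {document unit: count}, word or subword units
    background_probs:  {query term: background probability}
    translation_probs: {query term: {document unit: P(term | unit)}}
    doc_weight:        assumed mixture weight of the document state
    """
    doc_len = sum(doc_counts.values())
    log_p = 0.0
    for term in query_terms:
        # Translation step: P(term | D) = sum_c P(term | c) * P(c | D)
        p_doc = 0.0
        if doc_len > 0:
            for unit, p_term_given_unit in translation_probs.get(term, {}).items():
                p_doc += p_term_given_unit * doc_counts.get(unit, 0) / doc_len
        # Small floor so unseen terms do not produce log(0)
        p_bg = background_probs.get(term, 1e-9)
        log_p += math.log(doc_weight * p_doc + (1.0 - doc_weight) * p_bg)
    return log_p


def fuse_scores(word_scores, subword_scores, word_weight=0.5):
    """Linearly combine min-max normalized scores from two indexing scales."""
    def normalize(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: ((s - lo) / span if span > 0 else 0.0)
                for d, s in scores.items()}

    word_norm = normalize(word_scores)
    subword_norm = normalize(subword_scores)
    docs = set(word_norm) | set(subword_norm)
    return {d: word_weight * word_norm.get(d, 0.0)
               + (1.0 - word_weight) * subword_norm.get(d, 0.0)
            for d in docs}
```

In use, each document would be scored once per indexing scale with `hmm_query_log_likelihood`, and the two resulting score dictionaries would be passed to `fuse_scores` to produce a single ranking. Normalizing before combining is one simple way to make scores from different scales comparable; the paper may use a different fusion scheme.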

Journal: ACM Transactions on Asian Language Information Processing
Publication Date: Mar 1, 2003
Citations: 7

Similar Papers

Adapting google translate for English-Persian cross-lingual information retrieval in medical domain
Amin Rahmani
01 Oct 2017

Translation-based Ranking in Cross-Language Information Retrieval
01 Jan 2015

Evaluating Resource-Lean Cross-Lingual Embedding Models in Unsupervised Retrieval
Robert Litschko ... Laura Dietz
18 Jul 2019

Using learning to rank approach for parallel corpora based cross language information retrieval
13 Aug 2012
