CloudLM: a Cloud-based Language Model for Machine Translation

Jorge Ferrández-Tordera, Sergio Ortiz-Rojas, Antonio Toral

doi:10.1515/pralin-2016-0002

Open Access

https://doi.org/10.1515/pralin-2016-0002

Abstract

Language models (LMs) are an essential element in statistical approaches to natural language processing for tasks such as speech recognition and machine translation (MT). The advent of big data has made massive amounts of data available for building LMs; in fact, for the most prominent languages, it is not feasible with current techniques and hardware to train LMs on all the data now available. At the same time, it has been shown that the more data an LM is trained on, the better its performance, e.g. for MT, with no indication yet of reaching a plateau. This paper presents CloudLM, an open-source cloud-based LM intended for MT, which allows distributed LMs to be queried. CloudLM relies on Apache Solr and provides the functionality of state-of-the-art language modelling (it builds upon KenLM), while making it possible to query massive LMs (as the use of local memory is drastically reduced), at the expense of slower decoding speed.
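
To make the trade-off in the abstract concrete, the following is a minimal sketch of the standard backoff lookup used with ARPA-format n-gram models, run against a store that is queried one n-gram at a time. It is not CloudLM's query procedure or the Solr API: `RemoteNgramStore`, `backoff_logprob`, `TOY_TABLE`, and all probabilities are hypothetical names and made-up values, and a plain dictionary stands in for the remote index. The lookup counter illustrates why keeping the model off the local machine saves memory but costs decoding speed.

```python
from typing import Dict, Optional, Tuple

# Each entry maps an n-gram to (log10 probability, log10 backoff weight),
# mirroring the fields of an ARPA-format model. All values are made up.
Entry = Tuple[float, float]
Ngram = Tuple[str, ...]

TOY_TABLE: Dict[Ngram, Entry] = {
    ("the",): (-1.0, -0.5),
    ("cat",): (-2.0, -0.3),
    ("sat",): (-2.5, 0.0),
    ("the", "cat"): (-0.7, -0.2),
}


class RemoteNgramStore:
    """Hypothetical stand-in for a remote index; a dict plays the server."""

    def __init__(self, table: Dict[Ngram, Entry]) -> None:
        self._table = table
        self.lookups = 0  # in a real deployment each lookup is a round trip

    def get(self, ngram: Ngram) -> Optional[Entry]:
        self.lookups += 1
        return self._table.get(ngram)


def backoff_logprob(store: RemoteNgramStore, context: Ngram, word: str,
                    unk_logprob: float = -100.0) -> float:
    """Return log10 P(word | context) using backoff to shorter contexts."""
    entry = store.get(context + (word,))
    if entry is not None:
        return entry[0]  # the full n-gram is stored: use its probability
    if not context:
        return unk_logprob  # unseen unigram: assumed fixed penalty
    # Otherwise add the context's backoff weight (0 if the context is
    # unseen) and retry with the oldest context word dropped.
    ctx_entry = store.get(context)
    backoff = ctx_entry[1] if ctx_entry is not None else 0.0
    return backoff + backoff_logprob(store, context[1:], word, unk_logprob)


if __name__ == "__main__":
    store = RemoteNgramStore(TOY_TABLE)
    score = backoff_logprob(store, ("the", "cat"), "sat")
    print(f"log10 P(sat | the cat) = {score:.2f} after {store.lookups} lookups")
```

In this sketch a single unseen trigram triggers several store lookups as the query backs off to shorter contexts. When each lookup is a network request rather than a local memory access, that cost is paid for every hypothesis the decoder scores, which is consistent with the slower decoding the abstract reports.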

Journal: The Prague Bulletin of Mathematical Linguistics
Publication Date: Apr 1, 2016
Citations: 2
License type: CC BY-NC-ND 3.0

Similar Papers

Integration of Speech Recognition and Machine Translation in Computer-Assisted Translation
Shahram Khadivi ... Hermann Ney
IEEE Transactions on Audio, Speech, and Language Processing | VOL. 16
01 Nov 2008

The state of the art in language modeling
Joshua Goodman
01 Jan 2003

Unsupervised segmentation of words into morphemes - morpho challenge 2005 application to automatic speech recognition
Mikko Kurimo ... Ebru Arsoy
17 Sep 2006

QCRI's Live Speech Translation System
Fahim Dalvi ... Stephan Vogel
01 Jan 2018
