Abstract

Most current quality estimation (QE) models for machine translation are trained and evaluated in a fully supervised setting that requires significant quantities of labelled training data. However, obtaining labelled data can be both expensive and time-consuming. In addition, the test data that a deployed QE model is exposed to may differ from its training data in significant ways. In particular, training samples are often labelled by one or a small set of annotators, whose perceptions of translation quality and whose needs may differ substantially from those of the end-users who will rely on the predictions in practice. It is therefore desirable to adapt QE models efficiently to new user data with limited supervision. To address these challenges, we propose a Bayesian meta-learning approach for adapting QE models to the needs and preferences of each user with limited supervision. To further enhance performance, we extend a state-of-the-art Bayesian meta-learning approach with a matrix-valued kernel for Bayesian meta-learning of quality estimation. Experiments on data with varying numbers of users and language characteristics demonstrate that the proposed Bayesian meta-learning approach delivers improved predictive performance in both limited and full supervision settings.

Highlights

  • Quality Estimation (QE) models aim to evaluate the output of Machine Translation (MT) systems at run-time, when no reference translations are available (Blatz et al., 2004; Specia et al., 2009).

  • We further improve the performance of Bayesian meta-learning for the task of quality estimation by extending the state-of-the-art Bayesian Model-Agnostic Meta-Learning (BMAML) approach of Kim et al. (2018) to utilize Stein Variational Gradient Descent (Liu and Wang, 2016) with matrix-valued kernels (Wang et al., 2019), and demonstrate that this leads to enhanced predictive performance in both limited and full supervision settings.

  • In this work we propose to improve the predictive performance of BMAML for quality estimation by using matrix-valued Stein Variational Gradient Descent (Matrix-SVGD), which employs matrix-valued kernels for more effective parameter updates, in place of the original SVGD algorithm. Particle parameters are initialized from the model’s parameters and updated with K steps of Matrix-SVGD (using Equations (2) and (4) to (7)).
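The K-step particle update described in the last highlight can be sketched with the plain (scalar-kernel) SVGD of Liu and Wang (2016); the paper's Matrix-SVGD variant replaces the scalar RBF kernel below with a matrix-valued kernel, but the structure of the update is the same. The function names, fixed bandwidth, and step size here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(X, h):
    """Scalar RBF kernel k(x_j, x_i) = exp(-||x_j - x_i||^2 / h) and its
    gradient with respect to x_j, over a particle set X of shape (n, d)."""
    diff = X[:, None, :] - X[None, :, :]          # (n, n, d): x_j - x_i
    K = np.exp(-np.sum(diff ** 2, axis=-1) / h)   # (n, n)
    grad_K = -2.0 / h * diff * K[:, :, None]      # (n, n, d): grad_{x_j} k
    return K, grad_K

def svgd_step(X, grad_logp, step=0.1, h=1.0):
    """One SVGD update of the particles X (n, d).

    grad_logp: function mapping X -> (n, d) score gradients of the target.
    phi_i = (1/n) * sum_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    """
    n = X.shape[0]
    K, grad_K = rbf_kernel(X, h)
    phi = (K.T @ grad_logp(X) + grad_K.sum(axis=0)) / n
    return X + step * phi
```

Running this for a few hundred steps with the score of a standard Gaussian (`grad_logp = lambda P: -P`) drives particles initialized far from the origin toward samples of that Gaussian; the kernel-gradient term acts as a repulsive force that keeps particles spread out rather than collapsing to the mode. In Matrix-SVGD, `K` becomes a stack of d-by-d matrices that can precondition each particle's update direction.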


Summary

Introduction

Quality Estimation (QE) models aim to evaluate the output of Machine Translation (MT) systems at run-time, when no reference translations are available (Blatz et al., 2004; Specia et al., 2009). The perception of the quality of MT output can be subjective, and the quality estimates obtained from a model trained on data from one set of users may not serve the needs of a different set of users. Most existing QE models are trained and evaluated in a fully supervised setting which assumes access to substantial quantities of labelled supervision data, which may not be available and can be expensive and time-consuming to obtain. We further improve the performance of Bayesian meta-learning for the task of quality estimation by extending the state-of-the-art Bayesian Model-Agnostic Meta-Learning (BMAML) approach of Kim et al. (2018) to utilize Stein Variational Gradient Descent (Liu and Wang, 2016) with matrix-valued kernels (Wang et al., 2019), and demonstrate that this leads to enhanced predictive performance in both limited and full supervision settings.

Model-Agnostic Meta-Learning
Stein Variational Gradient Descent
Stein Variational Gradient Descent with Matrix-Valued Kernels
Bayesian Model-Agnostic Meta-Learning
QE Model
Limited Supervision Results
Full Supervision Results
Conclusions
A Additional Experimental Details