A Tutorial on Evaluation Metrics used in Natural Language Generation

Mitesh M Khapra, Ananya B Sai

doi:10.18653/v1/2021.naacl-tutorials.4

Abstract

The advent of deep learning and the availability of large-scale datasets have accelerated research on Natural Language Generation (NLG), with a focus on newer tasks and better models. With such rapid progress, it is vital to assess the extent of scientific progress made and to identify the areas and components that need improvement. To accomplish this in an automatic and reliable manner, the NLP community has actively pursued the development of automatic evaluation metrics. In the last few years especially, there has been an increasing focus on evaluation metrics, with several criticisms of existing metrics and proposals for several new ones. This tutorial presents the evolution of automatic evaluation metrics to their current state, along with the emerging trends in this field, by specifically addressing the following questions: (i) What makes NLG evaluation challenging? (ii) Why do we need automatic evaluation metrics? (iii) What are the existing automatic evaluation metrics and how can they be organised in a coherent taxonomy? (iv) What are the criticisms and shortcomings of existing metrics? (v) What are the possible future directions of research?

Publication Date: Jan 1, 2021
Citations: 1	License type: cc-by

