Abstract

Recently, several studies have reported promising results with BERT-like methods on acronym tasks. In this study, we find that an older rule-based program, Ab3P, not only performs better; error analysis also suggests why. There is a well-known spelling convention in acronyms where each letter in the short form (SF) corresponds to a “salient” letter in the long form (LF). The error analysis uses decision trees and logistic regression to show that there is an opportunity for many pre-trained models (BERT, T5, BioBert, BART, ERNIE) to take advantage of this spelling convention.

Highlights

  • Deep net models such as BERT (Devlin et al (2019)), GPT (Brown et al (2020)), ELMo (Peters et al (2018)), ERNIE (Sun et al (2020)), T5 (Raffel et al (2020)), BioBert (Lee et al (2020)), and BART (Lewis et al (2020)) have achieved record-setting performance on a wide range of tasks

  • This charmatch feature is easier to implement than the more general spelling constraint involving salient characters, since it can be tricky to define what counts as “salient.” Given the crux mentioned above, the first character is helpful for identifying the left edge of the long form (LF), and the left edge addresses the bulk of the opportunity

  • The Ab3P system is better than deep nets on the ADI task, in terms of precision and recall, and in terms of speed, memory and ease of use
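The charmatch idea from the highlights can be illustrated with a minimal Python sketch. It assumes charmatch simply means that the first character of the short form agrees with the first character of a candidate long form, ignoring case. The function names `charmatch` and `left_edge_candidates` are hypothetical and chosen for illustration; this is not the implementation used in Ab3P or in the paper's experiments.

```python
from typing import List


def charmatch(short_form: str, long_form: str) -> bool:
    """Assumed definition: first characters agree, ignoring case."""
    sf, lf = short_form.strip(), long_form.strip()
    return bool(sf) and bool(lf) and sf[0].lower() == lf[0].lower()


def left_edge_candidates(tokens: List[str], short_form: str) -> List[int]:
    """Indices of tokens that could start the long form under charmatch."""
    return [i for i, tok in enumerate(tokens) if charmatch(short_form, tok)]


if __name__ == "__main__":
    context = "the short form".split()
    # Only tokens starting with the same letter as "SF" qualify as a left edge.
    print(left_edge_candidates(context, "SF"))
```

A check of this kind only narrows down where the long form may begin; it says nothing about the remaining letters of the short form, which is where the harder question of "salient" characters arises.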


Summary

INTRODUCTION

Deep net models such as BERT (Devlin et al (2019)), GPT (Brown et al (2020)), ELMo (Peters et al (2018)), ERNIE (Sun et al (2020)), T5 (Raffel et al (2020)), BioBert (Lee et al (2020)), and BART (Lewis et al (2020)) have achieved record-setting performance on a wide range of tasks. It is standard practice in end-to-end machine learning to emphasize certain types of evidence and de-emphasize others. Acronyms are a special case of multiword expressions (MWEs) (Krovetz et al (2011)). In certain types of technical writing, it is common to abbreviate compounds (and MWEs) with acronyms.

WHAT DO WE MEAN BY MULTIWORD EXPRESSIONS?
ACRONYM TASKS
LONG-DISTANCE DEPENDENCIES
Standard Benchmarks for ADI Systems
Acronyms From arXiv
RELATED WORK
BERT-SQUAD
ERROR ANALYSIS
Charmatch
Using Decision Trees in Error Analysis
Document-Level Context
Constraints Within and Across Documents
Findings
CONCLUSION
DATA AVAILABILITY STATEMENT