On the Structural Disambiguation of Multi-word Terms

Melania Cabezas-García, Pilar León-Araúz

doi:10.1007/978-3-030-30135-4_4

Abstract

Multi-word terms (MWTs) pose many challenges in Natural Language Processing (NLP) because of their structural ambiguity. Although the structural disambiguation of multi-word expressions, also known as bracketing, has been widely studied, no definitive solution has yet been found. Linguists, terminologists, and translators also face bracketing problems, but they generally have to resolve them without advanced NLP systems. This paper describes a series of manual steps for the bracketing of MWTs based on their linguistic properties and recent advances in NLP. Based on an analysis of 100 three- and four-term combinations, a set of criteria for MWT bracketing was devised and arranged in a step-by-step protocol according to frequency and reliability. A case study that illustrates the procedure is also presented.

Full Text