Abstract

This paper presents several techniques for managing ambiguity in LFG parsing of Wolof, a less-resourced Niger-Congo language. Ambiguity is pervasive in Wolof, and this raises a number of theoretical and practical issues for ambiguity management, which are associated with different objectives. From a theoretical perspective, the main aim is to design a large-scale grammar for Wolof that is able to make linguistically motivated disambiguation decisions, and to find appropriate ways of controlling ambiguity at important interface representations. The practical aim is to develop disambiguation strategies to improve the performance of the grammar in terms of efficiency, robustness and coverage. To achieve these goals, different avenues are explored to manage ambiguity in the Wolof grammar, including the formal encoding of noun class indeterminacy, lexical specifications, the use of Constraint Grammar models (Karlsson 1990) for morphological disambiguation, the application of the c-structure pruning mechanism (Cahill et al. 2007, 2008; Crouch et al. 2013), and the use of optimality marks for preferences (Frank et al. 1998, 2001). The parsing system is further controlled by packing ambiguities. In addition, discriminant-based techniques for parse disambiguation (Rosén et al. 2007) are applied for treebanking purposes.

Highlights

  • This paper deals with the ambiguity problem in the process of analyzing texts in Wolof, a less-resourced language. It reports on several techniques used to manage ambiguity in a broad-coverage computational grammar and parser for Wolof

  • The grammar is implemented in the linguistic framework of Lexical Functional Grammar (LFG) (Kaplan and Bresnan 1982) using the Xerox Linguistic Environment (XLE) (Crouch et al. 2013)

  • The ambiguity phenomenon is perhaps the most serious problem faced by natural language processing (NLP) systems, and this is true for many reasons

Summary

University of Bergen abstract

This paper presents several techniques for managing ambiguity in LFG parsing of Wolof, a less-resourced Niger-Congo language. Due to the lack of resources, there is very limited scope for applying statistical approaches, which often require a large data set to ensure reliable results. To address these different research questions and to decide among the alternative ways of managing ambiguity, this work is based on three main premises. I will attempt to show how the application of the different disambiguation techniques discussed in this paper helps to manage ambiguity and to reduce parse time in the process of analyzing texts in Wolof. Note that the purpose is not to give an exhaustive account of all the disambiguation methods used within this research work, or to provide an exhaustive overview of their systematic interaction, but to illustrate ambiguity management in LFG parsing of Wolof, focusing on some example constructions which present particular challenges for grammar development and treebanking work for the language. Local ambiguity includes the following ambiguities discussed in Section (2.2): lexical, morphological, and syntactic ambiguities that are resolved when a larger sentential context is taken into account.

Types of ambiguity
Morphological and lexical ambiguity
Ambiguity due to Wolof noun classes
Syntactic ambiguity
Structural ambiguity
Ambiguity resolution for Wolof noun classes
Coping with POS ambiguity
PP flies
With CG
Using optimality marks
Candidate B
Handling coordination ambiguity
Ambiguity packing in XLE
Removing spurious ambiguities
Discriminant Type
Findings
Drop in parsing accuracy