Abstract
Proper names of organisations are a special case of collective nouns. Their meaning can be conceptualised as a collective unit or as a plurality of persons, allowing for different morphological marking of coreferent anaphoric pronouns. This paper explores the variability of references to organisation names with 1) a corpus analysis and 2) two crowd-sourced story continuation experiments. The first shows that the preference for singular vs. plural conceptualisation is dependent on the level of formality of a text. In the second, we observe a strong preference for the plural they otherwise typical of informal speech. Using edited corpus data instead of constructed sentences as stimuli reduces this preference.
Highlights
The experiments presented here address English only and serve as a pilot study for an investigation of reference to organisations across multiple languages. Through an analysis of the OntoNotes corpus (Pradhan et al., 2013) and two crowd-sourced story continuation experiments, we study how organisational named entities are referenced after their introduction in a discourse.
The referring expressions are categorised into four classes: repetition of the proper name, paraphrastic noun phrases with a common noun such as “the company”, forms of the pronoun it, and forms of the pronoun they.
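The four-way categorisation can be illustrated with a minimal Python sketch. It is not the annotation procedure used in the study, which the text here does not describe. The function name, the class labels, and the pronoun form lists are hypothetical, and treating every expression that is neither a pronoun nor an exact name match as a paraphrase is a simplifying assumption.

```python
# Hypothetical pronoun inventories; the source does not list the forms it counts.
IT_FORMS = {"it", "its", "itself"}
THEY_FORMS = {"they", "them", "their", "theirs", "themselves"}


def classify_reference(expression: str, org_name: str) -> str:
    """Assign a referring expression to one of the four classes.

    The labels are illustrative names chosen for this sketch.
    """
    text = expression.strip().lower()
    if text in IT_FORMS:
        return "pronoun_it"
    if text in THEY_FORMS:
        return "pronoun_they"
    if text == org_name.strip().lower():
        return "name_repetition"
    # Assumption: anything else is a paraphrastic noun phrase.
    return "paraphrase"


labels = [
    classify_reference(expr, "Intel")
    for expr in ["Intel", "the company", "its", "they"]
]
```

A realistic annotation scheme would also have to handle partial name repetitions and expressions that do not corefer with the organisation at all, which this sketch ignores.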
The names of organisations such as political bodies or companies are often made-up words (e.g., “Intel”, “Novartis”) or acronyms (e.g., “EU”, “Unesco”). They differ from other noun phrases in that they offer very little information about their grammatical properties such as number or, in languages where this is relevant, gender. Such names are a special case of the broader category of collective nouns, which includes common nouns such as “team” or “committee”. They can be conceptualised in different ways, by focusing either on the collective as a singular unit or on the plurality of people who make up the organisation.
Summary
The experiments presented here address English only and serve as a pilot study for an investigation of reference to organisations across multiple languages. Through an analysis of the OntoNotes corpus (Pradhan et al., 2013) and two crowd-sourced story continuation experiments, we study how organisational named entities are referenced after their introduction in a discourse. Names of organisations differ from other noun phrases in that they offer very little information about their grammatical properties such as number or, in languages where this is relevant, gender. Such names are a special case of the broader category of collective nouns, which includes common nouns such as “team” or “committee”. They can be conceptualised in different ways, by focusing either on the collective as a singular unit or on the plurality of people who make up the organisation. When they occur as antecedents of referring expressions, names of organisations are a challenge for natural language processing (NLP) because they can trigger different types of morphological marking on the anaphoric elements.