Abstract
The manual annotation of large corpora is time-consuming and raises issues of consistency. This paper aims to demonstrate how general rules for determining basic meanings can be formulated in large-scale projects in which multiple analysts apply MIP(VU) to authentic data. Three sets of problematic lexical units (chemical processes, colours, and sharp objects) are discussed in relation to the question of how the basic meaning of a lexical unit can be determined when human and non-human senses compete as candidates for the basic meaning. These analyses can therefore be considered a detailed case study of problems encountered during step 3.b of MIP(VU). They show how these problematic cases were tackled in a large corpus clean-up project in order to streamline the annotations and ensure greater consistency across the corpus. In addition, the paper points out how the formulation of general identification rules and guidelines could provide a first step towards the automatic detection of linguistic metaphors in natural discourse.