Abstract
HyperHamlet is a database of allusions to and quotations from Shakespeare's Hamlet, which is supported by the Swiss National Science Foundation as a joint venture between the Departments of English and German Philology, and the Image and Media Lab at the University of Basel. The compilation of a corpus, whose aim it is to document the "Shakespeare phenomenon", is intricate on more than one level: the desired transdisciplinary approach between linguistics, literary and cultural studies entails data selection from a vast variety of sources; the pragmatic nature of intertextual traces, i.e. their dependence on and subordination to new contexts, further adds to formal heterogeneity. This is not only a challenge for annotation, but also for data selection. As the recognition of intertextual traces is more often than not based on intuition, this paper analyses the criteria which underlie intuition so that it can be operationalised for scholarly corpus compilation. An analogue to the pragmatic model of ostensive-inferential communication with its three constitutive parts of speaker's meaning, sentence meaning and hearer's meaning has been used for analytical heuristics. Authorial intent – in a concrete as well as in an abstract historical sense – origin and specific encyclopaedic knowledge have been found to be the basic assumptions underlying data selection, while quantitative factors provide supporting evidence.
Highlights
Shakespeare is generally said to have contributed considerably to the lexicon and phrase stock of the English language, yet so far the documentation of this truism has been more anecdotal than systematic.
Our data, as well as the requirements posed by our objective of systematisation for a searchable database, have led to the conclusion that marking devices can be distinguished objectively by identifiable overt linguistic marking strategies which, in turn, are attributable to prior authorial intent.
The compilation of a corpus to document the "Shakespeare phenomenon" is intricate on more than one level: the desirable transdisciplinary approach entails data selection from a vast variety of sources, and the pragmatic nature of intertextual traces further adds to formal heterogeneity, which is a challenge both for annotation and for data selection.
Summary
1.1 The Corpus
Shakespeare is generally said to have contributed considerably to the lexicon and phrase stock of the English language, yet so far the documentation of this truism has been more anecdotal than systematic. HyperHamlet is a paradox: it is both a specialised and a reference corpus. It specialises in quotations and allusions, while admitting data from any language and period, from fiction and non-fiction, the visual arts and music, print and digital media, formal and informal settings.
Entries are annotated for the following features:
− type of reference, i.e. lexical, motif, name
− language
− year of composition
− extent of intertextual overlap, e.g. noun phrase, adjective phrase, verb phrase, clause
− modification type, e.g. substitution, omission, addition
− text genre, e.g. fiction, non-fiction
− text function, e.g. paratext, body of text
− narrative function, e.g. dialogue, neutral narrator, real author
− marking for author, work and quotation
− intertextual relationship, e.g. intertextuality, hypertextuality, metatextuality
These annotation features are in most cases further subcategorised so that the data can be adequately described: e.g. fiction > prose (drama/poetry) > romance (crime/fantasy/gothic/children's etc.), or paratext > stage direction (title/epigraph etc.).
Selection criteria influence the quality of data and the motivation of annotative features in a data-driven approach to corpus compilation.
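The annotation features listed above, together with their hierarchical subcategories, can be pictured as a simple record type. The following Python sketch is only an illustration under stated assumptions: the class name `Entry`, its field names, the helper functions and the two sample entries are all hypothetical and are not taken from the HyperHamlet database or its actual schema. Hierarchical features such as genre are modelled as paths (e.g. fiction > prose > crime), so that a query for a broad category also matches its subcategories.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """One hypothetical corpus entry; all field names are illustrative."""
    reference_type: str              # e.g. "lexical", "motif", "name"
    language: str
    year: int                        # year of composition
    overlap: str                     # e.g. "noun phrase", "clause"
    modification: Optional[str]      # e.g. "substitution"; None is an assumed
                                     # convention for "no modification"
    genre: Tuple[str, ...]           # path, e.g. ("fiction", "prose", "crime")
    text_function: Tuple[str, ...]   # path, e.g. ("paratext", "title")
    narrative_function: str          # e.g. "dialogue", "real author"
    marked_for: frozenset = frozenset()  # subset of {"author", "work", "quotation"}
    relationship: str = "intertextuality"


def in_category(path, prefix):
    """Return True if a hierarchical path falls under the category prefix."""
    return tuple(path[:len(prefix)]) == tuple(prefix)


def select(entries, genre_prefix=(), marked=None):
    """Filter entries by genre category and, optionally, by marking type."""
    return [
        e for e in entries
        if in_category(e.genre, genre_prefix)
        and (marked is None or marked in e.marked_for)
    ]


# Invented placeholder entries, not real corpus data.
SAMPLE = [
    Entry("lexical", "English", 1850, "clause", "substitution",
          ("fiction", "prose", "crime"), ("body of text",), "dialogue",
          frozenset({"quotation"})),
    Entry("name", "German", 1999, "noun phrase", None,
          ("non-fiction",), ("paratext", "title"), "real author"),
]

fiction_entries = select(SAMPLE, genre_prefix=("fiction",))
marked_quotations = select(SAMPLE, marked="quotation")
```

Storing each hierarchical feature as a path rather than a single label is one possible design choice; it keeps the subcategory (crime) and every broader category above it (prose, fiction) recoverable from the same field.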