  • Front Matter
  • 10.3366/cor.2025.0342
Front matter
  • Nov 1, 2025
  • Corpora

  • Research Article
  • 10.3366/cor.2025.0343
‘Maidan has become part of Ukrainian identity’: the dynamics of naming and framing civil resistance in parliamentary discourse
  • Nov 1, 2025
  • Corpora
  • Anna Kryvenko

This paper examines how civil resistance is constructed in parliamentary discourse, focussing on naming choices for the 2013–14 protests in Ukraine known as the Euromaidan, Revolution of Dignity, or Maidan, and their surrounding contexts in speeches by Ukrainian Members of Parliament and foreign guests in full-house sittings of the Ukrainian parliament from 2013 to 2023. Using metadata annotation in the newly created corpus of Ukrainian parliamentary proceedings under the ParlaMint project, the study explores the interplay between naming and framing the protests over time and across collective actors at different levels of data aggregation. The results indicate a decline in explicit references to the 2013–14 protests in Ukrainian parliamentary discourse, but each name in question follows its unique trajectory, showing variations in relative frequency and semantic preference. The study also discusses the limits of using these names interchangeably, considering their non-arbitrariness and word-building potential within the context of competing framings of the events by different political players.

  • Research Article
  • 10.3366/cor.2025.0348
The English Teacher Corpus: a novel approach to learner corpus development
  • Nov 1, 2025
  • Corpora
  • Tomáš Gráf + 2 more

This study introduces the English Teacher Corpus (ETC), delineating its development, technical parameters, and research prospects. This spoken learner corpus contains spontaneous and semi-spontaneous speech tasks performed by Czech teachers of English as a foreign language (EFL). The tasks include a monologue, dialogue, picture-based narrative, reading-aloud assignment, and an interview conducted in the teacher’s L1. Complementing this corpus is a reference counterpart featuring native English teachers based in the Czech Republic, mirroring the ETC’s task design. In its 12.5 hours of recorded and transcribed text, the ETC consists of 76,122 tokens for the L2 and 31,898 tokens for the L1 sub-corpus. The corpus has been partly transcribed by Whisper AI and subsequently aligned using EXMARaLDA. The ETC marks a pioneering effort as the first spoken learner corpus produced by EFL teachers. Its innovation extends beyond its content, as it gave rise to a developmental and pedagogical project within a university teacher-training programme.

  • Research Article
  • 10.3366/cor.2025.0344
Lexical bundles in L1 and L2 English thesis abstracts: a corpus-driven move approach
  • Nov 1, 2025
  • Corpora
  • Martina Jarkovská + 1 more

Lexical bundles are considered to be building blocks of written academic genres, including abstracts. Whilst the focus has primarily been on lexical bundles in abstracts of higher-level writing, little attention has been given to abstracts of lower-level theses. This study aimed to uncover differences between types and frequencies of lexical bundles and their structural patterns and textual roles within rhetorical moves in abstracts of students’ theses written in L1 and L2 English. In addition to the two novice writers’ corpora, we built a third corpus containing abstracts from research papers by L1 expert writers. Results revealed parallels and differences between L1 and L2 novice writing, and between novice and expert writing. L2 novice writers showed nearly over-cautious dependence on formulaic language, repeating the same patterns and functions and frequently failing to achieve their communicative purpose. Despite the similarities between the two types of novice writing, L1 novice writers showed similarities with L1 experts. A pedagogical implication is encouraging more diverse grammatical patterns and textual functions, which would result in a more accurate portrayal of the research conducted.

  • Research Article
  • 10.3366/cor.2025.0347
Climate change and Harvard students: English noun sequences and their German and Swedish correspondences
  • Nov 1, 2025
  • Corpora
  • Jenny Ström Herold + 1 more

This study explores English noun sequences such as climate change, with a common noun modifier, and Harvard students, with a proper noun modifier, contrasting German and Swedish. The material is provided by the Linnaeus University English–German–Swedish corpus (LEGS), a multi-directional 5-million word non-fiction corpus. The results show that the most common type of translation correspondence – regardless of translation direction – is the German and Swedish (solid) compound noun (world war > Weltkrieg/världskrig). When specifically focussing on English proper noun modifiers, it is, however, evident that these are less likely to produce compound nouns in translations, due to language-internal preferences in German and Swedish. Apart from the formal properties of correspondences, this study also takes semantics into account. We show that some types of semantic relations between the head and its modifying noun, such as Composition, which identifies the material of the head noun (silk cloth), are more likely to be rendered as compound nouns in German and Swedish. Amongst the non-compound correspondences in German and Swedish, post-modifying prepositional phrases are one of the more prominent alternatives (climate signal > signal från [‘from’] klimatet [Swedish]). This result is in line with our previous findings (Ström Herold and Levin, 2019; Levin and Ström Herold, 2024), suggesting that Swedish, more than German, favours post-modification. Amongst the notable translation effects, we observe how translators sometimes make the content more explicit through the addition of a noun, but also that the opposite applies.

  • Research Article
  • 10.3366/cor.2025.0349
Review: Baker. 2023. Using Corpora in Discourse Analysis. (Second edition.) London: Bloomsbury
  • Nov 1, 2025
  • Corpora
  • Gaoqiang Lu + 1 more

  • Research Article
  • 10.3366/cor.2025.0346
Collocations in downsampled corpora
  • Nov 1, 2025
  • Corpora
  • Antti Kanner

For many purposes, downsampling and using downsampled corpora are popular ways of extracting research data from larger text collections. Whilst most benefits of downsampling are gained from using it as a tool for qualitative inspection, it is not uncommon that the use of concordance or downsampled corpora is extended to collocation extraction. This is done with the underlying assumption that differences between a concordance or downsampled corpus and the full corpus are trivial in this regard. This paper reports on results from an analysis in which this assumption was specifically tested. The results show that, whilst there can often be a relatively high degree of agreement between the two methods, they cannot be relied upon to produce correlating rank-orders or overlapping top collocate lists in every case. Further, the difference was more marked in the case of content words in contrast to function words.

  • Research Article
  • 10.3366/cor.2025.0345
Building the AVATAR Therapy Dialogues Corpus: the process of constructing a longitudinal corpus of three-way psychotherapeutic interactions
  • Nov 1, 2025
  • Corpora
  • Sinéad Jackson + 15 more

AVATAR therapy is an innovative form of relational therapy for the treatment of distressing auditory verbal hallucinations, or voice-hearing, targeted at reducing voice-related distress. AVATAR therapy involves the creation of a digital simulation of a single voice, termed an ‘avatar’, which is used in a series of three-way therapeutic dialogues. This paper presents the AVATAR Therapy Dialogues Corpus, a specialised corpus containing orthographic transcriptions of AVATAR therapy sessions. We offer an overview of the corpus contents, and a detailed discussion of the design and construction of the corpus. We describe the processes and specialised tools created, transcription conventions, and mark-up designed to capture para-linguistic and non-speech features which may have clinical relevance. Finally, we discuss the potential of the corpus to provide a genuine innovation in clinical care, offering clinicians a data stream that could augment their understanding of patient experiences.

  • Front Matter
  • 10.3366/cor.2025.0350
Back matter
  • Nov 1, 2025
  • Corpora

  • Research Article
  • 10.3366/cor.2025.0336
Keywords and key emoji: investigating a university’s Twitter posts before, during and after COVID-related restrictions
  • Aug 1, 2025
  • Corpora
  • Luke C Collins + 1 more

Many universities use social media to communicate and engage with stakeholders, including students and staff. In recent years, universities were also faced with navigating the challenges resulting from the COVID-19 global pandemic and related restrictive measures that disrupted routine operations. In this paper, we examine a case study of a UK University and its posts on Twitter (now X) prior to, during and following the period of restrictive measures. With a focus on features of the ‘Conversational Human Voice’ (Kelleher, 2009), we report keywords and key emoji in a corpus of Twitter posts between 2018 and 2022. We demonstrate that despite the disruption of the pandemic and restrictive measures, the University maintained a consistent strategy, capitalising on the timeliness and broadcast functions of the platform to celebrate activities of its personnel and promote local events. Furthermore, we demonstrate how emoji and other paralinguistic elements can be incorporated into a multi-modal corpus analysis.