Abstract

In large reference corpora, representativeness is pursued through carefully selected samples and sheer size. Special language corpora are different in that their very nature limits their size. Their representativeness is judged against external selection criteria, generally based on bibliographic classifications, which tend to be subjective. To overcome this subjectivity in specialised corpora, a corpus-directed system of internal selection using lexical criteria is proposed. The aim is not to create rigid boundaries but to see clearly what is actually present in the corpus. The method is demonstrated on a corpus of research articles from specialised journals and conference proceedings in the field of plant biology. Restricted collocational networks are used to isolate prototypical groupings within the corpus. Audience is shown to be an important factor in strong and weak prototypical groupings in theme- and domain-specific corpora: articles addressing domain specialists through a journal tend to be more central than those presented to a theme-specific discourse community through conference proceedings.

