Structure and Usage of the Tartu University Corpus of Written Estonian

T Hennoste, T Roosmaa, M Saluveer, Mare Koit

doi:10.1075/ijcl.3.2.06hen


Journal: International Journal of Corpus Linguistics
Publication Date: Jan 1, 1998
Citations: 4

Affiliation: University of Tartu

#Tartu University #Model Corpora

Abstract

This paper provides an overview of the first computer corpus of the Estonian language, compiled at the University of Tartu by the university's interdepartmental computational linguistics research group. The corpus follows the design principles of the LOB and Brown corpora. Its main part was assembled between 1991 and 1995 and contains about 1 million textual words. The paper surveys the text groups in the corpus, describes the problems the compilers had to solve together with the solutions they proposed, and outlines the main differences from the model corpora and the underlying reasons for them. It closes with a review of the computer routines available for processing the corpus.
