Abstract

This paper reports, for the first time, the 1000 most common words and lemmas of Modern Greek, together with some of their quantitative characteristics. The word frequency list is based on the Hellenic National Corpus (HNC), a corpus of written Modern Greek texts comprising about 13 million words. In particular, we investigate how well Zipf’s law applies to both the 1000 most common words and the 1000 most common lemmas. In addition, we examine the frequency distribution of grammatical categories among these words and lemmas, the average word length in the whole HNC, and the growth of the average word length as a function of the number of most common words considered.
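As an illustration of the quantities named above, the following minimal Python sketch builds a rank-frequency list, estimates a Zipf exponent, and tracks average word length over the top-n words. It is not the authors' procedure: the function names are hypothetical, the exponent is estimated by an ordinary least-squares fit of log frequency against log rank (one common choice among several), and the average word length is assumed to be token-weighted, which the abstract does not specify. The toy token list merely stands in for a real corpus such as the HNC.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (word, frequency) pairs, most frequent first (ties broken alphabetically)."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def zipf_exponent(frequencies):
    """Estimate a in f(r) ~ C / r**a from frequencies listed in rank order.

    Assumption: a plain least-squares line through (log r, log f);
    the returned value is the negated slope of that line.
    """
    xs = [math.log(r) for r in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def average_length_growth(ranked):
    """Token-weighted average word length over the top-n words, for n = 1..len(ranked).

    Assumption: each word's length is weighted by its frequency.
    """
    averages = []
    total_chars = 0
    total_tokens = 0
    for word, freq in ranked:
        total_chars += len(word) * freq
        total_tokens += freq
        averages.append(total_chars / total_tokens)
    return averages


if __name__ == "__main__":
    # Toy stand-in for corpus tokens; a real study would read them from the corpus.
    tokens = "και το και να το και".split()
    ranked = rank_frequency(tokens)
    print(ranked)
    print(zipf_exponent([freq for _, freq in ranked]))
    print(average_length_growth(ranked))
```

An exponent close to 1 would correspond to the classical form of Zipf's law, in which frequency is roughly inversely proportional to rank. The same functions could be applied to a lemma list instead of a word-form list, provided the tokens are lemmatized first.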
