Abstract

The frequencies of graphemes in a text are generally governed by Zipf's law. In an attempt to develop a theoretical model for grapheme frequencies, Grzybek and Kelih tested different distribution models and concluded that the rank-frequency distribution for Slavic languages can be expressed by the negative hypergeometric distribution. Applying this distribution to different corpora led us to derive a functional relationship between ranks and the letters of the English alphabet, which forms the basis of the present study. To identify patterns of letters in the corpus, we applied group-theoretic methods and observed that different rings are generated for ranks 1 and 2, having values in the range 23–26, and fields for ranks in the ranges 3–9 and 10–22. Applying these rings and fields shows that the frequency distribution can always be fitted by locally adopting an equation in the sets. This led us to a general model for the rank-frequency distribution of English texts.
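As an illustration of the kind of rank-frequency fit discussed above, the following minimal Python sketch computes observed letter rank frequencies and evaluates a negative hypergeometric probability at each rank. The abstract gives no formula, so the 1-displaced parameterization used here (parameters `K`, `M`, `n`, with `K > M > 0` and ranks 1 to `n + 1`) is an assumption, and the helper names and example parameter values are hypothetical rather than taken from the paper.

```python
from collections import Counter
from math import exp, lgamma


def rank_frequencies(text):
    """Relative frequencies of the letters a-z in `text`, ordered from rank 1 down."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    counts = Counter(letters)
    total = sum(counts.values())
    return [c / total for _, c in counts.most_common()]


def _log_binom(a, b):
    """Log of the generalized binomial coefficient C(a, b) via log-gamma.

    Valid whenever a + 1, b + 1 and a - b + 1 are all positive.
    """
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)


def neg_hypergeom_pmf(rank, K, M, n):
    """Assumed 1-displaced negative hypergeometric probability at `rank`.

    P(rank) = C(M+x-1, x) * C(K-M+n-x-1, n-x) / C(K+n-1, n), with x = rank - 1,
    for rank = 1, ..., n + 1 and real parameters K > M > 0.
    """
    x = rank - 1
    log_p = (
        _log_binom(M + x - 1, x)
        + _log_binom(K - M + n - x - 1, n - x)
        - _log_binom(K + n - 1, n)
    )
    return exp(log_p)


if __name__ == "__main__":
    # Illustrative parameter values only; they are not fitted to any corpus.
    K, M, n = 3.0, 0.8, 25  # n + 1 = 26 ranks, one per English letter
    model = [neg_hypergeom_pmf(r, K, M, n) for r in range(1, n + 2)]
    observed = rank_frequencies("the quick brown fox jumps over the lazy dog")
    for r, (obs, fit) in enumerate(zip(observed, model), start=1):
        print(r, round(obs, 4), round(fit, 4))
```

In practice `K` and `M` would be estimated from the corpus, for example by minimizing a chi-square discrepancy between observed and model frequencies; that fitting step is omitted here.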
