Abstract

We study the rank frequency distribution (RFD) of the letters of the alphabet in Tamil-language texts. In a novel application of rank frequencies, we define a simple, intuitive distance parameter between a pair of strings (texts, or DNA sequences read as codons). This distance correlates well with age difference in both historical linguistics and evolutionary genetics. From the distance matrix of a set of strings, we derive evolutionary trees that broadly agree with historical evidence. The method can be refined further and applied in evolutionary studies to complement other approaches. The RFD of a single string conforms to the Cumulative Modified Power Law (CMPL), which we formulated earlier and have applied to the RFDs of diverse symbol sets.
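
The abstract does not state the formula for the distance parameter, so the following is only a minimal sketch of one plausible rank-based distance, not the definition used in the paper. It assumes the distance is the mean absolute difference between the frequency ranks of each symbol in the two strings, and that a symbol absent from one string is assigned a rank one past that string's last rank. The names `rank_table` and `rank_distance` are hypothetical and introduced here for illustration.

```python
from collections import Counter


def rank_table(text):
    """Map each symbol to its frequency rank (1 = most frequent)."""
    counts = Counter(text)
    # Sort by descending count; break ties by symbol so ranks are deterministic.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    return {sym: i + 1 for i, sym in enumerate(ordered)}


def rank_distance(a, b):
    """Mean absolute rank difference over the union of symbols.

    ASSUMPTION: this formula is illustrative; the paper's own distance
    parameter is not given in the abstract.
    """
    ra, rb = rank_table(a), rank_table(b)
    symbols = set(ra) | set(rb)
    if not symbols:
        return 0.0
    # ASSUMPTION: a symbol missing from a string gets rank (last rank + 1).
    default_a, default_b = len(ra) + 1, len(rb) + 1
    total = sum(abs(ra.get(s, default_a) - rb.get(s, default_b)) for s in symbols)
    return total / len(symbols)
```

For codon sequences, the same functions can be applied to a list of three-letter codons instead of a string of single characters, since `Counter` accepts any iterable of hashable symbols. Pairwise values of `rank_distance` over a set of strings would give the kind of distance matrix from which a tree can be built.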
