Abstract

Languages differ along multiple dimensions (lexis, phonology, morphology, syntax). Related languages descend from a common ancestor language but have diverged over time. This paper asks whether languages diverge equally along all dimensions and, to the extent that they do not, which dimension best reflects the traditional language family tree. We computed measures of (i) lexical distance, (ii) phonetic distance, and (iii) syntactic distance. The measures were computed on all words and sentences extracted from a corpus of translations of four relatively short English texts into four other Germanic languages (Danish, Dutch, German, Swedish), five Romance languages (French, Italian, Portuguese, Romanian, Spanish) and six Slavic languages (Bulgarian, Croatian, Czech, Polish, Slovakian, Slovenian). We examined the correlation structure of the distances for all pairs of Germanic (10), Romance (10) and Slavic (15) languages (i.e., within-family comparisons only). The results indicate that the linguistic dimensions are generally correlated (weakly but significantly), and that the correlations are stronger for pairs within a single family than when all 35 pairs are examined together. Cladistic family trees correlate best with lexical distance (0.851 < r < 0.887). This confirms that genealogical language trees are predominantly based on lexical rather than phonetic or syntactic considerations.
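
The pair counts above follow from the language lists: each family with n languages yields n(n-1)/2 within-family pairs, with English counted as the fifth Germanic language. The sketch below enumerates those pairs and includes a plain Pearson correlation helper of the kind that could be applied to two distance measures over the same pairs. It is only an illustration under stated assumptions: the names `FAMILIES`, `within_family_pairs` and `pearson_r` are hypothetical, the abstract does not specify which correlation coefficient the authors used beyond reporting r, and no distance values from the study are reproduced here.

```python
from itertools import combinations
from math import sqrt

# Language lists as given in the abstract; English is the source language
# and is counted here as the fifth Germanic language (an assumption that
# matches the reported 10 Germanic pairs).
FAMILIES = {
    "Germanic": ["English", "Danish", "Dutch", "German", "Swedish"],
    "Romance": ["French", "Italian", "Portuguese", "Romanian", "Spanish"],
    "Slavic": ["Bulgarian", "Croatian", "Czech", "Polish", "Slovakian", "Slovenian"],
}


def within_family_pairs(families):
    """Return, per family, every unordered pair of its languages."""
    return {name: list(combinations(langs, 2)) for name, langs in families.items()}


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


pairs = within_family_pairs(FAMILIES)
counts = {name: len(family_pairs) for name, family_pairs in pairs.items()}
print(counts)                # {'Germanic': 10, 'Romance': 10, 'Slavic': 15}
print(sum(counts.values()))  # 35
```

Given two lists of distances aligned to the same pair order (for example, lexical and phonetic distance for the 15 Slavic pairs), `pearson_r` would return their correlation for that family.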
