Abstract

The Indus Valley civilization thrived during its mature period, between 2500 BCE and 1800 BCE, and traded with West Asian civilizations such as Mesopotamia and Dilmun through the Persian Gulf. During this period, the Indus civilization developed a logosyllabic writing system now called the Indus script. The language it encodes is unknown, and the script is considered one of the biggest unsolved mysteries of historical linguistics. Researchers have collected Indus script texts from various sites in the Indian subcontinent and West Asia and compiled them into the Indus script text corpus. This research shows that several West Asian Indus texts in the corpus likely use a different language or syntax than the texts from the Indian subcontinent. We built several Markov chain language models based on n-grams from the Indus text corpus. Using the best-fit language model, we calculated the model perplexity for each of the West Asian Indus texts. The perplexity was high for most West Asian Indus texts, indicating that these texts fit poorly with a language model built only from Indian subcontinent texts. We conclude that the language and/or syntax of the West Asian Indus texts differs from that of the Indus texts from the Indian subcontinent. We hope that this research and the statistical models developed here will help quantify the geographical variation in Indus script texts and contribute to the Indus script decipherment effort.
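
The perplexity comparison described in the abstract can be illustrated with a minimal sketch. It assumes a bigram (first-order Markov chain) model with add-one smoothing; the abstract does not state the n-gram order or smoothing used in the best-fit model, so both are assumptions here. The sign labels and the tiny training set are made-up placeholders, not data from the Indus corpus, and the function names are hypothetical.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"  # boundary markers (an assumption of this sketch)

def train_bigram(texts):
    """Count bigrams and their left contexts over a list of sign sequences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = {END}
    for signs in texts:
        seq = [START] + list(signs) + [END]
        vocab.update(signs)
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab

def perplexity(signs, model, unseen_types=1):
    """Per-sign perplexity of one text under an add-one smoothed bigram model."""
    context_counts, bigram_counts, vocab = model
    v = len(vocab) + unseen_types  # reserve probability mass for unseen signs
    seq = [START] + list(signs) + [END]
    log_prob = 0.0
    for prev, cur in zip(seq, seq[1:]):
        p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + v)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(seq) - 1))

# Placeholder "training" texts standing in for Indian subcontinent texts.
training = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["B", "C"]]
model = train_bigram(training)

in_pattern = perplexity(["A", "B", "C"], model)        # follows the training order
out_pattern = perplexity(["C", "A", "D", "B"], model)  # unfamiliar sign order
print(in_pattern < out_pattern)
```

A text whose sign order matches the training texts receives a lower perplexity than one whose order does not, which is the kind of contrast the abstract reports between Indian subcontinent and West Asian texts.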
