Abstract

In this work we consider the Glissando Corpus, an oral corpus of Catalan and Spanish, and empirically analyze the presence of the four classical linguistic laws (Zipf’s law, Herdan’s law, Brevity law, and Menzerath–Altmann’s law) in oral communication. We complement this with the analysis of two recently formulated laws: the lognormality law and the size-rank law. By aligning the acoustic signal of speech production with the speech transcriptions, we are able to measure and compare how well each of these laws holds in both physical and symbolic units. Our results show that these six laws are recovered in both languages, but considerably more clearly when they are examined in physical units. This reinforces the so-called ‘physical hypothesis’, according to which linguistic laws might indeed have a physical origin, and the patterns recovered in written texts would therefore be just a byproduct of the regularities already present in the acoustic signals of oral communication.

Highlights

  • Linguistic laws are statistical regularities and properties of linguistic elements which can be formulated mathematically and estimated quantitatively [1]. While linguistic laws have been thoroughly studied over the last century [1,2,3], the debate on their ultimate origin is still open

  • For the phonetic inventory of Spanish and Catalan, note that only phonemes that appear effectively in the Glissando corpus have been taken into account, without considering other phonemes that could appear in other linguistic varieties of both languages [41]

  • Linguistic laws are again recovered with only slight differences with respect to English [4] and some technical details that are worth detailing for each law

Summary

Introduction

Linguistic laws are statistical regularities and properties of linguistic elements (i.e., phonemes, syllables, words or sentences) which can be formulated mathematically and estimated quantitatively [1]. While linguistic laws have been thoroughly studied over the last century [1,2,3], the debate on their ultimate origin is still open. After some notable precursors (such as Pareto [5], Estoup [6] or Condon [7], among others), George Kingsley Zipf formulated and explained in [8,9] one of the most popular quantitative linguistic observations, known in his honor as Zipf’s law. He observed that, when the words of a written corpus are ordered by decreasing frequency, the number of occurrences of the word with rank $r$ can be expressed as $f(r) \sim r^{-\alpha}$. This is a solid linguistic law proven in ...

Entropy 2019, 21, 1153; doi:10.3390/e21121153 www.mdpi.com/journal/entropy
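To make the rank-frequency relation of Zipf’s law concrete, the following is a minimal Python sketch that counts word frequencies, ranks them in decreasing order, and estimates the exponent α from the slope of a least-squares line in log-log space. The function names, the toy token list and the fitting procedure are assumptions made for illustration only. They are not taken from the paper, and a plain log-log regression is a crude estimator, not necessarily the method the authors use.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return word frequencies sorted in decreasing order (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


def estimate_zipf_exponent(frequencies):
    """Estimate alpha in f(r) ~ r^(-alpha) by least squares on log-log axes.

    Illustrative only: needs at least two ranks, and a plain log-log fit
    is a rough estimator of a power-law exponent.
    """
    xs = [math.log(r) for r in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is -alpha, so flip the sign.
    return -sxy / sxx


# Hypothetical toy data (not from the Glissando Corpus): the word of rank r
# is repeated about 1000 / r times, so alpha should come out close to 1.
toy_tokens = []
for rank in range(1, 51):
    toy_tokens.extend([f"w{rank}"] * round(1000 / rank))

alpha = estimate_zipf_exponent(rank_frequency(toy_tokens))
print(f"estimated alpha = {alpha:.2f}")
```

On real transcriptions, `toy_tokens` would be replaced by the list of word tokens of the corpus. The fitted value then depends on the range of ranks included in the fit.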
