Abstract

Objective: Few studies of spoken word corpora in the Indian context are available in the literature. To meet the demand for a spoken word frequency database in Hindi for advanced psycholinguistic and cognitive studies, we set out to establish a preliminary spoken word database of the Hindi language for children studying in Grade VI to Grade IX.

Methods: To create the spoken word corpus, subjects were given a recorder to record their conversations. The recorded samples were transcribed into Hindi text using voice note II software. The transcribed samples were uploaded into Text Analyzer software, and word frequency, number of syllables, and lexical density were computed.

Results: The spoken word corpus consists of a total of 49,476 words. Lexical density was higher for females than for males because the female database contains more unique words. The study also revealed that subjects used function words and verbs most frequently, followed by nouns.

Conclusion: The current database provides information about the high-frequency and low-frequency words used by children studying in Grade VI to Grade IX. This database will be helpful in psycholinguistic and cognitive experiments; however, the present corpus included data only from the middle socioeconomic group and contained relatively few words. The present study is preliminary; future studies require a more extensive word database.
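The word-frequency and lexical-density computations described in the Methods can be illustrated with a short sketch. This is not the Text Analyzer software used in the study; it assumes whitespace tokenization and assumes that lexical density is defined as the percentage of unique words among all words, which is consistent with the reported link between lexical density and unique words but is not confirmed by the abstract. The function name `word_stats` and the sample sentence are hypothetical.

```python
from collections import Counter


def word_stats(transcript: str):
    """Return (frequency table, total words, unique words, lexical density %).

    Assumption: lexical density = 100 * unique words / total words.
    Assumption: words are separated by whitespace; no punctuation handling.
    """
    tokens = transcript.split()
    freq = Counter(tokens)          # word -> number of occurrences
    total = len(tokens)             # total number of words
    unique = len(freq)              # number of distinct words
    density = 100.0 * unique / total if total else 0.0
    return freq, total, unique, density


# Hypothetical sample sentence, not taken from the corpus.
sample = "मैं स्कूल जाता हूँ और मैं घर आता हूँ"
freq, total, unique, density = word_stats(sample)

print(freq.most_common(2))
print(total, unique, round(density, 2))
```

Applied separately to the male and female transcripts, the same function would give the per-group densities that the Results compare; a real pipeline would also need punctuation stripping and Unicode normalization for Devanagari text.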
