Stanza Type Identification using Systematization of Versification System of Hindi Poetry

Milind Kumar Audichya, Jatinderkumar R

doi:10.14569/ijacsa.2021.0120117

Abstract

Poetry forms a large part of the literature of any language, and Hindi poetry likewise makes up a substantial portion of Hindi literature. Composing Hindi poetry requires following various rules of versification. This paper focuses on automatic metadata generation from such poems using advanced, systematic, prosody rule-based modeling and detection procedures from computational linguistics, designed specifically for Hindi poetry. The paper covers the challenges involved and the best possible solutions for them, and describes the methodology for automatically generating “Chhand” metadata from the stanzas of poems. It also provides some advanced information and techniques for metadata generation for “Muktak Chhands”. The “Chhand” rules incorporated in this research were identified, verified, and modeled from a computational linguistics perspective for the first time, which required considerable effort and time. In this research work, 111 different “Chhand” rules were found, and this paper presents rule-based modeling of all of them. Of the modeled “Chhands”, the research work covers 53 for which between 20 and 277 examples were found and used for automatic processing of the data for metadata generation. The automatic metadata generator processed 3120 UTF-8 encoded inputs covering 53 Hindi “Chhand” types and achieved 95.02% overall accuracy, with an overall failure rate of 4.98%. The minimum processing time for generating “Chhand” metadata was 1.12 seconds, and the maximum was 91.79 seconds.

Highlights

Hindi (‘हिंदी’) is known as a prevalent language
Saini and Kaur [19] carried out emotion detection research focusing on ‘Navrasa’ using the machine learning algorithms Naïve Bayes (NB) and Support Vector Machine (SVM); SVM performed better, with 70.02% overall accuracy
The research work is sufficient to support automatic metadata generation from Hindi poetry: it already covers the majority of ‘Chhands’ and can incorporate new kinds of ‘Chhands’ systematically and with ease

Summary

Introduction

Hindi (‘हिंदी’) is known as a prevalent language. According to India’s 2011 census, 322 million people spoke Hindi as their first language [1]. Every written language needs a script. For Hindi, that script is Devanagari (‘देवनागरी’), which is the fourth most widely adopted writing system in the world [2]. More than 120 languages around the world are written in the Devanagari script. As per The Unicode Standard, Version 13.0, the Devanagari Unicode range is 0900–097F [3].
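As an illustration of how the Devanagari Unicode range can be used when handling UTF-8 input, the following minimal Python sketch checks whether characters fall in the 0900–097F block. It is not taken from the paper: the function names `is_devanagari` and `devanagari_ratio` are hypothetical, and using a ratio to screen input text is an assumption made purely for this example.

```python
# Bounds of the Devanagari block as cited in the text (U+0900 to U+097F).
DEVANAGARI_START = 0x0900
DEVANAGARI_END = 0x097F


def is_devanagari(ch: str) -> bool:
    """Return True if the single character lies in the Devanagari block."""
    return DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END


def devanagari_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are Devanagari.

    Hypothetical helper: a screening step like this is an assumption,
    not a procedure described in the paper.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(is_devanagari(c) for c in chars) / len(chars)
```

A caller could, for instance, compute `devanagari_ratio(stanza)` and skip inputs that contain little or no Devanagari text before attempting any prosody analysis. Note that the block also contains punctuation such as the danda (U+0964), so this check identifies the script block and not only letters.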

Methods

Results

Conclusion