Abstract
Computational linguistics is an active research area within Natural Language Processing (NLP) and Artificial Intelligence. This paper presents a syntactic parsing technique for text in Kannada, a South Indian language. The Cocke–Younger–Kasami (CYK) parsing technique has been adopted to parse Kannada sentences and identify their grammatical structure. Currently, very few NLP tools are available for parsing many Indian languages. Hence, we have made an effort to efficiently parse the structure of complex sentences in Kannada text using the CYK algorithm. CYK is a bottom-up dynamic programming approach that works only with a grammar in Chomsky normal form (CNF). We give annotated Kannada sentences as input to the parser model and obtain a judgment of the grammatical correctness of each sentence as output. The parser model generates an NLTK parse tree only for grammatically correct sentences; for incomplete and incorrect Kannada sentences, it displays error messages. This work has been implemented and tested on 1000 annotated Kannada sentences. The input dataset contains both simple and compound sentences, many of which are 20–22 words in length. The proposed syntax parser model produced promising results on Kannada text.

Keywords: Natural language processing; CYK algorithm; Syntactical parsing; Chomsky normal form; Production rules; Rule-based grammar
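To illustrate the bottom-up dynamic programming idea behind CYK, the following is a minimal recognizer sketch. It is not the parser described in this paper: the grammar, the tag names (`noun`, `adj`, `verb`), and the function `cyk_recognize` are hypothetical placeholders, and the input is assumed to be a sequence of part-of-speech tags rather than Kannada words. The sketch only reports whether a tag sequence is derivable from the start symbol; it does not build an NLTK parse tree.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (not the paper's rule set).
# Binary rules A -> B C, indexed by the right-hand side (B, C).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("NP", "V"): {"VP"},
    ("ADJ", "N"): {"NP"},
}
# Lexical rules A -> tag, indexed by the terminal tag.
LEXICAL_RULES = {
    "noun": {"N", "NP"},
    "adj": {"ADJ"},
    "verb": {"V"},
}


def cyk_recognize(tags, start="S"):
    """Return True if the tag sequence is derivable from `start`."""
    n = len(tags)
    if n == 0:
        return False
    # table[i][length - 1] holds the nonterminals deriving tags[i : i + length].
    table = [[set() for _ in range(n)] for _ in range(n)]
    # Spans of length 1 come from the lexical rules.
    for i, tag in enumerate(tags):
        table[i][0] = set(LEXICAL_RULES.get(tag, ()))
    # Longer spans are built bottom-up from two shorter adjacent spans.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for b, c in product(left, right):
                    table[i][length - 1] |= BINARY_RULES.get((b, c), set())
    return start in table[0][n - 1]


if __name__ == "__main__":
    # Subject-object-verb order, as in the toy grammar above.
    print(cyk_recognize(["noun", "noun", "verb"]))
    print(cyk_recognize(["noun", "verb"]))
```

Under this toy grammar, a sequence such as `noun noun verb` is accepted, while `noun verb` is rejected because no rule combines the two spans into `S`. A full parser would additionally store back-pointers in each table cell so that a tree can be reconstructed for accepted sentences.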