Penn Treebank-Based Syntactic Parsers for South Dravidian Languages using a Machine Learning Approach

K P Soman, Nandini J Warrier, P J Antony

doi:10.5120/1272-1789

Abstract

With limited electronic resources available, developing a syntactic parser that handles all types of sentence forms is a challenging and demanding task for any natural language. This paper presents the development of Penn Treebank-based statistical syntactic parsers for two South Dravidian languages, Kannada and Malayalam. Syntactic parsing is the task of recognizing a sentence and assigning a syntactic structure to it. A syntactic parser is an essential tool for many natural language processing (NLP) applications and for natural language understanding. The well-known grammar formalism called the Penn Treebank structure was used to create the corpus for the proposed statistical syntactic parsers. Both parsing systems were trained on a Treebank-based corpus consisting of 1,000 carefully constructed Kannada and Malayalam sentences. The corpus is already annotated with correct segmentation and part-of-speech (POS) information. We used our own POS tagger generator to assign tags to every word in the training and test sentences. The proposed syntactic parsers were implemented using supervised machine learning and probabilistic context-free grammar (PCFG) approaches. Training, testing, and evaluation were done with support vector method (SVM) algorithms. In our experiments, both systems performed well and achieved very competitive accuracy.
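To make the PCFG idea mentioned in the abstract concrete, the following is a minimal sketch of how rule probabilities can be estimated by relative frequency from Penn Treebank-style bracketed trees. It is not the authors' implementation (the abstract reports SVM-based training), and the toy English trees and the function names `parse_bracketed`, `productions`, and `estimate_pcfg` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def parse_bracketed(text):
    """Parse one Penn Treebank-style bracketed tree into (label, children)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "a subtree must start with '('"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, children)

    return read()


def productions(tree):
    """Yield every (lhs, rhs) rule used in the tree, top-down."""
    label, children = tree
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    yield (label, rhs)
    for child in children:
        if isinstance(child, tuple):
            yield from productions(child)


def estimate_pcfg(trees):
    """Relative-frequency estimate: P(lhs -> rhs) = count(rule) / count(lhs)."""
    rule_counts = Counter()
    lhs_counts = Counter()
    for tree in trees:
        for lhs, rhs in productions(tree):
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


# Hypothetical toy treebank (English, for illustration only; not the paper's data).
TOY_TREEBANK = [
    "(S (NP (NN dogs)) (VP (VB bark)))",
    "(S (NP (NN cats)) (VP (VB sleep)))",
    "(S (NP (DT the) (NN dog)) (VP (VB sleeps)))",
]

trees = [parse_bracketed(s) for s in TOY_TREEBANK]
pcfg = estimate_pcfg(trees)

for (lhs, rhs), p in sorted(pcfg.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{p:.3f}]")
```

In this sketch the probabilities of all rules sharing a left-hand side sum to one, which is the defining property of a PCFG. A real system would also need smoothing for unseen rules and a parsing algorithm such as CKY to apply the grammar.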


Journal: International Journal of Computer Applications
Publication Date: Oct 10, 2010
Citations: 10

