Abstract

Natural Language Processing (NLP) is a multidisciplinary area spanning Artificial Intelligence, Machine Learning and Computational Linguistics that deals with processing human language automatically. It involves both understanding and processing human language. The way in which we share content or feelings has always been of great importance in understanding and processing language. Parsing is the most suitable approach for identifying and scanning what the available sentences express. It is the process in which the syntactic structure of a sentence is identified using grammatical tags. A syntactically correct sentence structure is obtained by assigning grammatical labels to its constituents using a lexicon and syntactic rules. Phrase structure and dependency structure are the two main formalisms for parsing natural language sentences. The growing use of Web 2.0 has produced novel research challenges, as people from different geographical areas use this channel and share content in their native languages. Urdu is one such free word order language; it is widely shared on social media sites, but identifying and summarizing Urdu sentences is a challenging task. In this review paper we present an overview of recent work on parsing fixed word order languages (e.g. English) and free word order languages (e.g. Urdu) in order to identify the most suitable method for Urdu language parsing. The survey finds that dependency parsing is more appropriate for Urdu and other free word order languages, and that parsers built for English are not useful for parsing Urdu sentences because of morphological, syntactic and grammatical differences.

Highlights

  • Natural Language Processing is a multidisciplinary area spanning Artificial Intelligence, Machine Learning and Computational Linguistics that deals with processing human language automatically

  • In linguistics, human behavior can be assessed by considering three key aspects: speaking, writing and communication

  • The syntactic structure of a sentence can be described in two ways: Phrase Structure (PS), in which the whole sentence is tokenized into constituents or phrases and a tree is generated as output, as shown in fig. 1, and Dependency Structure (DS), in which individual tokens are connected through links representing dependency relations, as shown in fig
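
The two representations described in the last highlight can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the example sentence, the tag and relation labels, and the helper names `leaves` and `dependents` are hypothetical and do not come from the paper.

```python
# Illustrative sentence (an assumption, not taken from the paper): "She reads books"

# Phrase Structure (PS): nested (label, children...) tuples; leaves are words.
phrase_tree = (
    "S",
    ("NP", ("PRP", "She")),
    ("VP", ("VBZ", "reads"), ("NP", ("NNS", "books"))),
)

# Dependency Structure (DS): (head, relation, dependent) links between tokens.
dependencies = [
    ("ROOT", "root", "reads"),
    ("reads", "nsubj", "She"),
    ("reads", "obj", "books"),
]


def leaves(tree):
    """Return the words of a PS tree in left-to-right order."""
    label, *children = tree
    # A pre-terminal node such as ("PRP", "She") holds exactly one word.
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def dependents(deps, head):
    """Return {relation: dependent} for every link leaving `head`."""
    return {rel: dep for h, rel, dep in deps if h == head}


print(leaves(phrase_tree))
print(dependents(dependencies, "reads"))
```

The PS tree encodes word order through the nesting of its constituents, whereas the DS links only name heads and dependents. The same set of links can therefore describe a sentence whose words appear in a different order, which is consistent with the survey's view that dependency structure suits free word order languages.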


Summary

A Review on Urdu Language Parsing

Abstract: Natural Language Processing is a multidisciplinary area spanning Artificial Intelligence, Machine Learning and Computational Linguistics that deals with processing human language automatically. In this review paper we present an overview of recent work on parsing fixed word order languages (e.g. English) and free word order languages (e.g. Urdu) in order to identify the most suitable method for Urdu language parsing. The survey finds that dependency parsing is more appropriate for Urdu and other free word order languages, and that parsers built for English are not useful for parsing Urdu sentences because of morphological, syntactic and grammatical differences.

INTRODUCTION
URDU: A NOVEL CHALLENGE FOR NATURAL LANGUAGE PROCESSING
DEPENDENCY PARSING
Findings
CONCLUSION
