Abstract

Formal grammar, introduced by Chomsky, is one of the most important developments in Natural Language Processing, a branch of Artificial Intelligence. Formal grammars make it possible to represent languages mathematically. Almost all natural languages have word classes such as noun, adjective and verb. In addition, a sentence consists of a noun phrase and a verb phrase, and a noun phrase may contain location, destination and source elements. Despite many similarities between languages, there are important differences in the grammar rules of languages belonging to different language families. In this study, the most appropriate formal grammar for representing Turkish is investigated. The accuracy of the suggested grammar rules is evaluated on two different corpora. This study is an enhanced version of “Turkish Context Free Grammar Rules with Case Suffix and Phrase Relation”, which was presented at the UBMK 2016 International Conference on Computer Science & Engineering \cite{ilk}. Unlike the first study, this study covers all word and sentence types of Turkish: adjectives and prepositions are considered; quoted sentences, incomplete sentences and question sentences are included; and genitive phrase structures containing verbal words are included. The noun phrases are also defined in detail.

Highlights

  • Scientific studies of language began in the 1900s.

  • In the first method, to assess how appropriate the formal language is for the given language data, we test whether each sentence can be generated using the rules of the corresponding context-free grammar (CFG).

  • If a sentence cannot be generated using the rules of the corresponding CFG, we assign it the score "False".
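
The sentence-level test described in the highlights can be illustrated with a minimal sketch. It assumes a toy grammar in Chomsky normal form over word-class tags and uses the standard CYK recognition algorithm. The rules, the tag names and the helper names `can_generate` and `score_corpus` are hypothetical and chosen for illustration only. They are not the Turkish CFG rules proposed in the paper, and a rejection here only reflects the toy rules.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (NOT the paper's rules).
# Binary rules map a pair of right-hand-side symbols to possible left-hand sides.
BINARY_RULES = {
    ("NP", "VP"): {"S"},    # S  -> NP VP
    ("ADJ", "NP"): {"NP"},  # NP -> ADJ NP
    ("NP", "V"): {"VP"},    # VP -> NP V (verb-final order)
}
# Lexical rules map a word-class tag to the nonterminals that can produce it.
LEXICAL_RULES = {
    "noun": {"NP"},
    "adj": {"ADJ"},
    "verb": {"V", "VP"},
}


def can_generate(tags, start="S"):
    """Return True if the tag sequence is derivable from `start` (CYK)."""
    n = len(tags)
    if n == 0:
        return False
    # table[i][j] holds every nonterminal that derives tags[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tag in enumerate(tags):
        table[i][i] = set(LEXICAL_RULES.get(tag, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point between left and right parts
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return start in table[0][n - 1]


def score_corpus(corpus):
    """Score each sentence True/False and return the share scored True."""
    results = [can_generate(sentence) for sentence in corpus]
    accuracy = sum(results) / len(results) if results else 0.0
    return results, accuracy


if __name__ == "__main__":
    corpus = [
        ["noun", "verb"],
        ["adj", "noun", "noun", "verb"],
        ["adj", "adj"],
    ]
    print(score_corpus(corpus))
```

A sentence that the grammar cannot derive receives the score False, and the share of sentences scored True gives one simple measure of how well a rule set covers a corpus.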

Summary

Introduction

Scientific studies of language began in the 1900s. Researchers have investigated what natural language is [2], [3], what its features are [4], [5], [6], whether it can be represented mathematically [15], [8], and whether a universal language can be created [9]. English is represented by context-free grammar (CFG), which is a type of formal grammar. Different studies have determined different CFG rules for English [14], [15]. There are studies related to CFG for English [17], [18]. The most appropriate context-free grammar and rules have also been sought for Turkish. Kuru used an augmented transition network (ATN) for extracting Turkish suffixes [20]. This is an extensive study that includes different types of phrases with verbal items. The study does not consider recursion in the sentence or type transformation. Our literature research shows that it is the only study using Combinatory Categorial Grammar (CCG), which is a kind of CFG. In this study, word type transformation and Turkish-specific phrase types are not included.

Phrase structure with suffixes
Free phrase order
Compatibility between predicate and the other phrases
Transformation structure with suffixes
Recursion in sentences
CFG for simple sentence
CFG rules for noun phrase in simple sentence
CFG for complex sentence
CFG for compound sentence
Turkish CFG for incomplete sentence
Evaluation and Results
Discussion and Conclusion