Abstract

Packrat parsing is a recently introduced technique based on parsing expression grammars. It uses memoization to avoid redundant function calls and thereby guarantees linear parse time. This paper surveys the progress made in packrat parsing to date and discusses approaches to carrying out this parsing process efficiently. Other issues associated with this approach, such as left recursion and error reporting, are also discussed, along with the efforts researchers have made to address them. The paper therefore presents a state-of-the-art review of packrat parsing that researchers can use for further, efficient development of the technology.

Highlights

  • Parsing consists of two processes: lexical analysis and parsing

  • It is shown that parsing expression grammars (PEGs), similar to GTDPL and TS, can express all left-to-right leftmost derivation (LL)(k) and left-to-right rightmost derivation (LR)(k) languages, and that they can be parsed in linear time with the memoization technique [7]

  • A comprehensive review is given of packrat parsing, introduced by Ford in 2002


Summary

INTRODUCTION

Parsing consists of two processes: lexical analysis and parsing. The job of lexical analysis is to break the input text (string) into smaller parts, called tokens. The memoization employed in packrat parsing eliminates the main disadvantage of conventional top-down backtracking algorithms, which suffer from exponential parsing time in the worst case. This exponential runtime is caused by redundant evaluations performed during backtracking. Packrat parsers avoid it by storing every evaluated result so that it can be reused on later backtracking, eliminating redundant computation. This storing technique is called memoization, and it guarantees linear parsing time for packrat parsers. Packrat parsing can even parse some grammars that are not context-free [7]. Another characteristic of packrat parsing is that it is scannerless, i.e. a separate lexical analyzer is not needed. Owing to the deterministic nature of the resulting grammar, it was discovered that parsing results could be saved in a table to avoid redundant computations. This approach was never put into practice at the time because of the limited amount of main memory in computers [10,14]. The paper focuses on the open problems in packrat parsing and concludes with future work.
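The memoization idea described above can be illustrated with a minimal sketch. The code below is not taken from the paper: the small arithmetic grammar, the class name PackratParser, the helper parse, and the calls counter are all assumptions chosen for illustration. Each rule result is cached under the key (rule name, input position), and the parser reads characters directly, so no separate lexer is involved.

```python
class PackratParser:
    """Illustrative recursive-descent parser for a tiny PEG (hypothetical example).

    Additive  <- Multitive '+' Additive / Multitive
    Multitive <- Primary '*' Multitive / Primary
    Primary   <- '(' Additive ')' / Decimal
    Decimal   <- [0-9]+
    """

    def __init__(self, text, memoize=True):
        self.text = text
        self.memoize = memoize
        self.memo = {}   # (rule name, position) -> (value, next position) or None
        self.calls = 0   # number of rule bodies actually executed

    def _apply(self, rule, pos):
        # Look the result up before evaluating; failures (None) are cached too.
        key = (rule.__name__, pos)
        if self.memoize and key in self.memo:
            return self.memo[key]
        self.calls += 1
        result = rule(pos)
        if self.memoize:
            self.memo[key] = result
        return result

    def _char(self, pos, ch):
        # Scannerless: match a single character directly in the input text.
        return pos < len(self.text) and self.text[pos] == ch

    def additive(self, pos):
        left = self._apply(self.multitive, pos)
        if left is not None and self._char(left[1], "+"):
            right = self._apply(self.additive, left[1] + 1)
            if right is not None:
                return left[0] + right[0], right[1]
        # Ordered choice: backtrack and try the second alternative.
        return self._apply(self.multitive, pos)

    def multitive(self, pos):
        left = self._apply(self.primary, pos)
        if left is not None and self._char(left[1], "*"):
            right = self._apply(self.multitive, left[1] + 1)
            if right is not None:
                return left[0] * right[0], right[1]
        return self._apply(self.primary, pos)

    def primary(self, pos):
        if self._char(pos, "("):
            inner = self._apply(self.additive, pos + 1)
            if inner is not None and self._char(inner[1], ")"):
                return inner[0], inner[1] + 1
        return self._apply(self.decimal, pos)

    def decimal(self, pos):
        end = pos
        while end < len(self.text) and self.text[end] in "0123456789":
            end += 1
        if end == pos:
            return None
        return int(self.text[pos:end]), end


def parse(text, memoize=True):
    """Return (value, rule evaluations); raise SyntaxError on bad input."""
    parser = PackratParser(text, memoize)
    result = parser._apply(parser.additive, 0)
    if result is None or result[1] != len(text):
        raise SyntaxError(f"cannot parse {text!r}")
    return result[0], parser.calls


if __name__ == "__main__":
    nested = "(" * 6 + "1" + ")" * 6
    print(parse(nested, memoize=True))
    print(parse(nested, memoize=False))
```

Running the sketch on a deeply parenthesized input with memoize=False shows the number of rule evaluations growing rapidly with nesting depth, because each backtrack re-parses the same position. With memoize=True each (rule, position) pair is evaluated at most once, which is the source of the linear-time bound; the cost is the memory held by the memo table.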

PARSING EXPRESSION GRAMMAR
Definitions and Operators
Ambiguity
Left Recursion
Syntactic Predicates
Memoization
Scannerless
LITERATURE SURVEY
Maintaining States
CONCLUSION AND FUTURE WORK