Abstract

String pattern matching is one of the oldest computational subdomains, used in applications as diverse as bioinformatics and natural language processing. Most mainstream languages provide only simple facilities for string processing, culminating in regular expressions. Regular expressions are concise but lack power and flexibility, and usually form a disjoint sublanguage within the host language. Conflicting design goals for string processing languages include: (1) concise expression of string patterns, as provided by regular expressions; (2) expressive power and flexibility sufficient to handle complex and dynamic pattern contexts that cannot be described by regular or context-free languages; and (3) peaceful coexistence and integration with ordinary (non-pattern) computational facilities. This paper describes the integration of regular expressions and SNOBOL patterns into the string scanning control structure of Unicon, a successor of the SNOBOL and Icon languages. Regular expressions compile into, and mix with, SNOBOL-style patterns. Unicon's matching operator is extended to execute a SNOBOL pattern match anchored at the current scanning position; a new operator performs an unanchored match. Patterns can in turn contain string scanning expressions, supporting full bidirectional integration of the matching facilities.
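
The distinction between an anchored and an unanchored match can be illustrated with a minimal sketch. It is written in Python with the standard `re` module as an analogy only; it is not Unicon syntax and does not reproduce the paper's operators. The names `DIGITS`, `subject`, `anchored_match`, and `unanchored_match` are hypothetical, introduced for illustration.

```python
import re

# Hypothetical pattern and subject string, chosen only for illustration.
DIGITS = re.compile(r"[0-9]+")
subject = "abc123def"


def anchored_match(pattern, text, pos):
    """Succeed only if the pattern matches starting exactly at pos.

    Returns the position just past the match, or None on failure.
    This mimics a match anchored at a scanning position.
    """
    m = pattern.match(text, pos)
    return m.end() if m else None


def unanchored_match(pattern, text, pos=0):
    """Search forward from pos for the first place the pattern matches.

    Returns (start, end) of the match, or None on failure.
    """
    m = pattern.search(text, pos)
    return (m.start(), m.end()) if m else None


print(anchored_match(DIGITS, subject, 0))   # fails: no digit at position 0
print(anchored_match(DIGITS, subject, 3))   # succeeds: digits begin at position 3
print(unanchored_match(DIGITS, subject))    # finds the digits wherever they start
```

In this analogy the anchored form behaves like a single scanning step that either advances the position or fails, while the unanchored form moves the starting point forward until a match is found.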

