Abstract
Total reduplication is common in natural language phonology and morphology. However, formalized as copying of reduplicants of unbounded size, unrestricted total reduplication requires computational power beyond context-free, whereas other phonological and morphological patterns are regular or even sub-regular. Thus, existing language classes that characterize reduplicated strings inevitably include typologically unattested context-free patterns, such as reversals. This paper extends regular languages to incorporate reduplication by introducing a new computational device: finite-state buffered machines (FSBMs). We give their mathematical definition and discuss some closure properties of the corresponding set of languages. As a result, we characterize the class consisting of the regular languages and the languages derived from them through a copying mechanism. As previous literature suggests, this class should come close to characterizing natural language word sets.
Highlights
Formal language theory (FLT) provides computational mechanisms characterizing different classes of abstract languages based on their inherent structures
Several findings suggest that the four levels of the Chomsky Hierarchy (CH) do not align precisely with natural languages, and some have led to major refinements of the hierarchy
We analyze another mismatch between existing well-known language classes and empirical findings: reduplication, which involves copying operations on certain base forms (Inkelas and Zoll, 2005)
Summary
FSBMs are two-tape automata with a finite-state core control. One tape stores the input, as in ordinary FSAs; the other serves as an unbounded memory buffer that temporarily stores reduplicants for later identity checking. The buffer interacts with the input in restricted ways: 1) the buffer is queue-like; 2) the buffer uses the same alphabet as the input, unlike the stack of a pushdown automaton (PDA), for example; 3) once a symbol has been removed from the buffer, the buffer must be emptied completely before any new symbol can be added. Together, these restrictions ensure that the machine does not generate string reversals or other non-reduplicative non-regular patterns. Transitions between two H states check input-memory identity and consume symbols in both the input and the buffer. The language recognized by an FSBM M is denoted by L(M): w ∈ L(M) iff there is a run of M on w.
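The buffer discipline described above can be illustrated with a small sketch. This is not the formal FSBM definition from the paper: it assumes a single buffering phase followed by a single emptying phase, omits the finite-state control and the H states, and simulates nondeterminism by trying every possible switch point. The function name `accepts_total_reduplication` is hypothetical.

```python
from collections import deque


def accepts_total_reduplication(word: str) -> bool:
    """Accept exactly the strings of the form ww with w non-empty.

    Simplified illustration only: one buffering phase, then one
    emptying phase, with no finite-state control.
    """
    # Simulate nondeterminism: try every position at which the machine
    # could switch from buffering to emptying.
    for switch in range(1, len(word)):
        buffer = deque()  # queue-like buffer over the input alphabet
        matched = True
        for i, symbol in enumerate(word):
            if i < switch:
                # Buffering: read the input symbol and enqueue a copy.
                buffer.append(symbol)
            else:
                # Emptying: the input symbol must equal the front of the
                # queue. Nothing is enqueued again in this phase.
                if not buffer or buffer[0] != symbol:
                    matched = False
                    break
                buffer.popleft()
        # Accept only if all input is consumed and the buffer is empty.
        if matched and not buffer:
            return True
    return False


if __name__ == "__main__":
    for candidate in ["abab", "abba", "aa", "aaa"]:
        print(candidate, accepts_total_reduplication(candidate))
```

Because the buffer is first-in, first-out, the copy `abab` is accepted while the reversal `abba` is rejected; a stack in place of the queue would give the opposite behaviour.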