Generalized LR Research Articles

Phonotactic constraints can be employed to distinguish languages by representing a speech utterance as a multinomial distribution of phone events. In the present study, we propose a new learning mechanism based on subspace-based representation, which can extract concealed phonotactic structures from utterances, for language verification and dialect/accent identification. The framework involves two successive parts. The first part is subspace construction: it decodes each utterance into a sequence of vectors filled with phone posteriors and transforms the vector sequence into a linear orthogonal subspace based on low-rank matrix factorization or dynamic linear modeling. The second part is subspace learning based on kernel machines, such as support vector machines and the newly developed subspace-based neural networks (SNNs). The input layer of SNNs is specifically designed for samples represented as subspaces. The topology ensures that identical subspaces yield the same output by modifying the conventional feed-forward pass to fit the mathematical definition of subspace similarity. Evaluated on the “General LR” test of NIST LRE 2007, the proposed method achieved up to 52%, 46%, 56%, and 27% relative reductions in equal error rates over the sequence-based PPR-LM, PPR-VSM, and PPR-IVEC methods and the lattice-based PPR-LM method, respectively. Furthermore, on the dialect/accent identification task of NIST LRE 2009, the SNN-based system performed better than the aforementioned four baseline methods.
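The subspace construction and similarity steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a truncated SVD as the low-rank factorization and the projection kernel (the sum of squared cosines of the principal angles) as the subspace similarity. The function names `utterance_subspace` and `projection_similarity` and the `rank` parameter are hypothetical.

```python
import numpy as np


def utterance_subspace(posteriors: np.ndarray, rank: int) -> np.ndarray:
    """Return an orthonormal basis of shape (n_phones, rank).

    `posteriors` is an (n_frames, n_phones) matrix whose rows are
    phone-posterior vectors for one utterance (assumed input layout).
    """
    # Truncated SVD is one possible low-rank factorization (an assumption);
    # the leading right singular vectors span the dominant directions
    # of the frame-by-phone matrix.
    _, _, vt = np.linalg.svd(posteriors, full_matrices=False)
    return vt[:rank].T


def projection_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Projection kernel ||U^T V||_F^2 between two orthonormal bases.

    The value depends only on the spanned subspaces, not on the
    particular bases chosen to represent them.
    """
    return float(np.linalg.norm(u.T @ v, ord="fro") ** 2)
```

Because the similarity depends only on the span, two different orthonormal bases of the same subspace score the same, which is the property the abstract attributes to the SNN input layer. For a subspace of dimension `rank` compared with itself, the value equals `rank`.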

A language implementation with proper compositionality enables a compiler developer to divide-and-conquer the complexity of building a large language by constructing a set of smaller languages. Ideally, these small language implementations should be independent of each other such that they can be designed, implemented and debugged individually, and later be reused in different applications (e.g., building domain-specific languages). However, the language composition offered by several existing parser generators resides at the grammar level, which means all the grammar modules need to be composed together and all corresponding ambiguities have to be resolved before generating a single parser for the language. This produces tight coupling between grammar modules, which harms information hiding and affects independent development of language features. To address this problem, we have developed a novel parsing algorithm that we call Component-based LR (CLR) parsing, which provides code-level compositionality for language development by producing a separate parser for each grammar component. In addition to shift and reduce actions, the algorithm extends general LR parsing by introducing switch and return actions to empower the parsing action to jump from one parser to another. Our experimental evaluation demonstrates that CLR increases the comprehensibility, reusability, changeability and independent development ability of the language implementation. Moreover, the loose coupling among parser components enables CLR to describe grammars that contain LR parsing conflicts or require ambiguous token definitions, such as island grammars and embedded languages.
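The control transfer behind the switch and return actions can be illustrated with a toy driver. This sketch is not the CLR algorithm: the components below are hand-written state machines rather than generated LR tables, and the bracket delimiters, the class names `HostParser` and `SumParser`, and the `parse` driver are all invented for illustration. It only shows how a driver might keep a stack of independent component parsers and jump between them.

```python
from typing import Dict, List, Optional, Tuple, Type

Action = Tuple[str, object]  # (kind, argument)


class HostParser:
    """Collects words; on '[' it asks the driver to switch to 'sum'."""

    def __init__(self) -> None:
        self.items: List[object] = []

    def step(self, tok: Optional[str]) -> Action:
        if tok is None:              # end of input: hand back the result
            return ("return", self.items)
        if tok == "[":               # embedded region starts here
            return ("switch", "sum")
        self.items.append(tok)
        return ("shift", None)

    def resume(self, value: object) -> None:
        # Called when an embedded component returns its result.
        self.items.append(value)


class SumParser:
    """Adds digits separated by '+'; returns to its caller on ']'."""

    def __init__(self) -> None:
        self.total = 0

    def step(self, tok: Optional[str]) -> Action:
        if tok == "]":
            return ("return", self.total)
        if tok == "+":
            return ("shift", None)
        if tok is not None and tok.isdigit():
            self.total += int(tok)
            return ("shift", None)
        raise SyntaxError(f"unexpected token in sum: {tok!r}")

    def resume(self, value: object) -> None:
        raise SyntaxError("sum has no embedded components")


COMPONENTS: Dict[str, Type] = {"host": HostParser, "sum": SumParser}


def parse(tokens: List[str], components: Dict[str, Type], start: str) -> object:
    stack = [components[start]()]    # stack of active component parsers
    pos = 0
    while True:
        tok = tokens[pos] if pos < len(tokens) else None
        kind, arg = stack[-1].step(tok)
        if kind == "shift":
            pos += 1
        elif kind == "switch":
            pos += 1                 # consume the opening delimiter
            stack.append(components[arg]())
        elif kind == "return":
            stack.pop()
            if not stack:            # the start component has finished
                return arg
            pos += 1                 # consume the closing delimiter
            stack[-1].resume(arg)


print(parse("a [ 1 + 2 ] b".split(), COMPONENTS, "host"))  # ['a', 3, 'b']
```

Each component knows nothing about the other's internals; only the driver and the component name passed with the switch action connect them, which loosely mirrors the loose coupling the abstract describes.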

Articles published on Generalized LR

Fast GLR parsers for extended BNF grammars and transition networks

Subspace-Based Representation and Learning for Phonotactic Spoken Language Recognition

Linear Programming with Triangular Fuzzy Numbers — A Case Study in a Finance and Credit Institute

Component-based LR parsing

Optimized GLR Parsing for Programming Languages

Even faster generalized LR parsing

Generation of LR parsers by partial evaluation

Intensive Determinants of Remote Masking
