Abstract

Researchers of bilingual code-switching often assume that one of the participating languages serves as the ‘base’ or ‘matrix’ into which elements of the other language are embedded. However, the means by which the matrix language of a clause or extended discourse is determined remains much debated: it has been variously associated with the numerical frequency of lemmas, with the language of the predominant closed-class or functional morphemes, or with the first language in a left-to-right parsing, oftentimes with contradictory results. By lemma frequency, the matrix language of “Being bilingue is mas sexy” would be either Spanish or English, depending on the language annotation of sexy; by the other two criteria it would be unambiguously English, as established by the gerund and copula or by its initial ordering in the surface string. Accurate identification of the matrix language for bilingual text or speech is important for linguists because it is proposed to be predictive of the grammatical constraints that are observed in code-switching. In natural language processing, detection of the matrix language can inform the selection of tools as researchers seek to analyze the ever-increasing volume of mixed-language data. This poster presentation demonstrates several metrics for easily quantifying and visualizing the matrix language, at various levels of analysis, in ways that are valid and replicable. The metrics were developed by the Bilingual Annotations Tasks (BATs) research group, an interdisciplinary cohort directed by Professors Bullock and Toribio and MA candidate Gualberto Guzman.
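
As a minimal illustration of why the three criteria named above can disagree, the sketch below applies each of them to the example sentence. It does not reproduce the BATs group's metrics: the token tuples, the language labels (`eng`, `spa`), the function-word flags, and the function names are all assumptions made for illustration, and simple token counts stand in for lemma frequency.

```python
from collections import Counter

def annotate(sexy_lang):
    """Hypothetical hand annotation: (token, language, is_function_word).

    The language of "sexy" is passed in because its annotation is the
    ambiguous one. Treating "mas" as a content word is also an assumption.
    """
    return [
        ("Being", "eng", True),      # gerund of the copula
        ("bilingue", "spa", False),
        ("is", "eng", True),         # copula
        ("mas", "spa", False),
        ("sexy", sexy_lang, False),
    ]

def by_frequency(tokens):
    """Matrix language = language contributing the most tokens."""
    counts = Counter(lang for _, lang, _ in tokens)
    return counts.most_common(1)[0][0]

def by_function_words(tokens):
    """Matrix language = language contributing the most function words."""
    counts = Counter(lang for _, lang, is_func in tokens if is_func)
    return counts.most_common(1)[0][0]

def by_first_token(tokens):
    """Matrix language = language of the first token, read left to right."""
    return tokens[0][1]

if __name__ == "__main__":
    for sexy_lang in ("spa", "eng"):
        tokens = annotate(sexy_lang)
        print(
            f"sexy={sexy_lang}:",
            "frequency ->", by_frequency(tokens),
            "| function words ->", by_function_words(tokens),
            "| first token ->", by_first_token(tokens),
        )
```

Under these hand annotations, only the frequency criterion changes its answer when sexy is relabeled; the function-word and first-token criteria return English in both cases. Ties are not handled here: `Counter.most_common` simply returns the first language encountered when counts are equal.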

