A recent Journal article1 found that language preference was often unassociated, and English proficiency consistently associated, with self-rated health, implying the two should be modeled separately. A reexamination of this study suggests the reported conclusions depended on erroneous measures and highlights the importance of validity in linguistic studies.

Two alternative linguistic indicators with stronger face validity were examined alongside the self-reports used in the original study in a replication using the same National Latino and Asian American Study (NLAAS) data. Interviewers rated respondents' English proficiency as non-English speaking, poor, fair, good, or excellent, and this rating was used to assess the criterion validity of self-reported language proficiency. Stronger preferences for English should be associated with conducting the survey in English; hence, the survey language was used to assess the predictive validity of self-reported language preference. A combined index was computed by standardizing interviewers' reports of English proficiency and whether the interview was conducted in English, so that each item was weighted equally in the final index, and then summing the resulting standardized scores (Cronbach α = 0.78). These items were then tested in separate models predicting the 5 categories of self-rated health, replicating the original model specification, using ordinal logistic regression. (A complete methodological appendix is available as a supplement to the online version of this article at http://www.ajph.org.)

All interviewer-rated non-English speakers were misclassified by self-report; 75% reported they spoke poor, 18% fair, and 7% good English. This misclassification persisted where interviewer and self-report categories overlapped. Among respondents rated as having poor English proficiency, for example, 28% reported fair and 7% good English-speaking proficiency. Overestimation tendencies also affected self-reported preferences. About 18% of non-English speakers reported using some English, 7% equal parts English and their native language, and 3% mostly English with friends. Respondents who reported speaking only English with friends versus mostly their native language had similar tendencies to interview in English (18% and 17%, respectively). The strong reliability originally reported with other self-reported measures for which no alternative existed (e.g., reading proficiency) suggests those measures may be similarly biased.

Contrary to the original study, behavioral measures of English proficiency, preference, and their combined scale had similar significant associations with self-rated health (Figure 1). For example, surveying in English versus surveying in another language was associated with a 13% (95% confidence interval [CI] = 10, 17) higher probability of excellent health and a corresponding 4% (95% CI = 2, 5) lower probability of poor health. Standardized coefficients were computed to compare the relative effect size across measures, given the variation in measurement units.
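To make the analytic steps concrete, a minimal sketch of how the combined index, its reliability, and the standardized ordinal logistic comparisons could be computed is given below (in Python, using pandas and statsmodels). The file and variable names are hypothetical placeholders rather than NLAAS variable names, and the covariates of the original model specification are omitted, so the sketch illustrates the approach rather than reproducing the reported analysis.

```python
# Illustrative sketch only (not the authors' code): the file name, variable
# names, and single-predictor models are hypothetical placeholders, and the
# covariates of the original model specification are omitted.
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel


def zscore(s: pd.Series) -> pd.Series:
    # Standardize so items carry equal weight and coefficients are comparable.
    return (s - s.mean()) / s.std(ddof=0)


df = pd.read_csv("nlaas_subset.csv")  # hypothetical analysis file

# Combined index: standardize the two behavioral items, then sum them.
items = pd.DataFrame({
    "proficiency_z": zscore(df["interviewer_rating"]),       # 0-4 interviewer rating
    "english_survey_z": zscore(df["interview_in_english"]),  # 1 = surveyed in English
})
df["combined_index"] = items.sum(axis=1)

# Cronbach's alpha for the two-item index.
k = items.shape[1]
alpha = k / (k - 1) * (1 - items.var(ddof=1).sum() / items.sum(axis=1).var(ddof=1))
print("Cronbach alpha:", round(alpha, 2))

# Self-rated health as an ordered outcome with 5 categories.
df["health"] = pd.Categorical(
    df["self_rated_health"],
    categories=["poor", "fair", "good", "very good", "excellent"],
    ordered=True,
)

# Separate ordinal logistic regressions with standardized predictors, so the
# coefficients (B) can be compared across measures.
fits = {}
for predictor in ["interviewer_rating", "interview_in_english", "combined_index"]:
    exog = zscore(df[predictor]).to_frame(predictor)
    fits[predictor] = OrderedModel(df["health"], exog, distr="logit").fit(
        method="bfgs", disp=False
    )
    print(predictor,
          fits[predictor].params[predictor],
          fits[predictor].conf_int().loc[predictor].values)

# Change in predicted probabilities of excellent vs. poor health when the
# survey was conducted in English rather than another language.
res = fits["interview_in_english"]
contrast = zscore(df["interview_in_english"]).agg(["min", "max"]).to_numpy().reshape(-1, 1)
p = res.model.predict(res.params, exog=contrast)  # row 0: other language, row 1: English
print("Change in P(excellent):", p[1, -1] - p[0, -1])
print("Change in P(poor):", p[1, 0] - p[0, 0])
```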
English proficiency (B = −0.57; 95% CI = −0.73, −0.42), preference (B = −0.52; 95% CI = −0.71, −0.34), and their combined scale (B = −0.70; 95% CI = −0.88, −0.52) had statistically indistinguishable associations with self-rated health.

FIGURE 1. Change in probability of self-rated health by English language differences in survey preference, proficiency, and their combined scale.

It is well known that self-reports are susceptible to large systematic biases,2 yet many studies have relied on self-reported linguistic measures. Among the 15 observational studies in the Journal since 1999 that used acculturation, acculturative, or acculturated in their title or abstract, 10 (67%)3–12 relied entirely on self-reported linguistic measures and only 3 (20%)13–15 included observed linguistic traits. Public health has been strongly critical of biomedical models16; however, when it comes to measures, a biomedical perspective in which the validity of survey measures is assumed appears common. The possible biases that interfered with the replicated study1 may similarly affect other studies, especially those that relied entirely on self-reported measures. The reasons are many, but in addition to being a health determinant, English may be perceived as relevant to social status, and respondents may be tempted to overestimate their English abilities. Self-reports may also be limited by poor across-subject reliability, with abilities under- or overestimated relative to direct observation.

It is admirable that Gee et al.1 have conducted research on measurement issues in acculturative studies, but their conclusions are questionable given the systematic error in their primary measures. Future research should instead focus on the validity of linguistic measures by refining observational measures that overcome the limitations of self-reports. Studies focused on enhancing the reliability and validity of interviewer-observed linguistic measures are needed. Given the deviations of self-reports from face-valid alternatives, it is advisable to apply similar strategies, as explored here, to other studies.