What's in a Name? A Method for Extracting Information about Ethnicity from Names

J Andrew Harris

doi:10.1093/pan/mpu038

Abstract

Questions about racial or ethnic group identity feature centrally in many social science theories, but detailed data on ethnic composition are often difficult to obtain, out of date, or otherwise unavailable. The proliferation of publicly available geocoded person names provides one potential source of such data, if researchers can effectively link names and group identity. This article examines that linkage and presents a methodology for estimating local ethnic or racial composition using the relationship between group membership and person names. Common approaches for linking names and identity groups perform poorly when estimating group proportions. I have developed a new method for estimating racial or ethnic composition from names which requires no classification of individual names. This method provides more accurate estimates than the standard approach and works in any context where person names contain information about group membership. Illustrations from two very different contexts are provided: the United States and the Republic of Kenya.

Highlights

Political scientists often consider theories about racial or ethnic identity at the local level, where detailed data on the ethnic or racial composition of the population are scarce (Hopkins 2010; Enos 2011; Kasara 2013). At the same time, large numbers of locally geo-coded person names are increasingly available.
To provide a proof of concept, I begin in a context with copious information on names and racial demography: the United States.
Based on King and Lu (2008) and Hopkins and King (2010), the method avoids individual classification of names in a list and instead focuses on modeling the proportions of each unique name in a list. This approach yields more efficient estimates of group proportions than approaches based on individual

Summary

Introduction

Political scientists often consider theories about racial or ethnic identity at the local level, where detailed data on the ethnic or racial composition of the population are scarce (Hopkins 2010; Enos 2011; Kasara 2013). At the same time, large numbers of locally geo-coded person names (e.g., voter registers or phone listings) are increasingly available. I apply the proposed method to names from the East African nation of Kenya, where existing direct measures of local ethnic composition (e.g., census or survey data) are, as in many places in the developing world, unavailable or unsuitable for the research question. Based on King and Lu (2008) and Hopkins and King (2010), the method avoids individual classification of names in a list and instead focuses on modeling the proportions of each unique name in a list. This approach yields more efficient estimates of group proportions than approaches based on individual. Code to implement these methods is available in the online appendix and on the author’s website.
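The contrast between classifying individual names and modeling name proportions can be illustrated with a small sketch. It is not the article's estimator, whose details are not reproduced on this page. As a stand-in, it treats the observed name counts in a list as a mixture of group-specific name distributions and fits the mixing shares with a plain EM update. The group labels, the names, the probabilities in `NAME_GIVEN_GROUP`, and the counts in `TARGET_COUNTS` are all made-up toy values, and the function names are hypothetical.

```python
from collections import Counter

# Toy training data (assumed, not from the article): P(name | group).
NAME_GIVEN_GROUP = {
    "group_1": {"name_a": 0.6, "name_b": 0.3, "name_c": 0.1},
    "group_2": {"name_a": 0.1, "name_b": 0.2, "name_c": 0.7},
}

# Toy target list (assumed): counts generated from true shares 0.3 / 0.7.
TARGET_COUNTS = {"name_a": 25, "name_b": 23, "name_c": 52}


def estimate_group_shares(name_counts, name_given_group, n_iter=500):
    """EM fit of group shares, holding P(name | group) fixed.

    No individual name is ever assigned to a single group.
    """
    groups = list(name_given_group)
    shares = {g: 1.0 / len(groups) for g in groups}  # start from equal shares
    for _ in range(n_iter):
        new_shares = {g: 0.0 for g in groups}
        for name, count in name_counts.items():
            # Joint weight of (group, name) under the current shares.
            weights = {
                g: shares[g] * name_given_group[g].get(name, 0.0) for g in groups
            }
            denom = sum(weights.values())
            if denom == 0.0:
                continue  # name absent from training data: no information
            for g in groups:
                new_shares[g] += count * weights[g] / denom
        norm = sum(new_shares.values())
        if norm == 0.0:
            return shares  # nothing in the list overlaps the training data
        shares = {g: new_shares[g] / norm for g in groups}
    return shares


def classify_and_count(name_counts, name_given_group):
    """Baseline: give each name to its most likely group, then tally."""
    groups = list(name_given_group)
    tallies = Counter()
    for name, count in name_counts.items():
        best = max(groups, key=lambda g: name_given_group[g].get(name, 0.0))
        tallies[best] += count
    total = sum(tallies.values())
    return {g: tallies[g] / total for g in groups}


em_shares = estimate_group_shares(TARGET_COUNTS, NAME_GIVEN_GROUP)
cc_shares = classify_and_count(TARGET_COUNTS, NAME_GIVEN_GROUP)
print("proportion-based:", {g: round(s, 3) for g, s in em_shares.items()})
print("classify-and-count:", {g: round(s, 3) for g, s in cc_shares.items()})
```

On these toy numbers the proportion-based fit recovers shares of about 0.30 and 0.70, while classify-and-count reports 0.48 and 0.52, because every bearer of an ambiguous name such as `name_b` is pushed into a single group. The sketch assumes that the name distributions within each group are the same in the training data and in the target list.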

Estimating Ethnic Proportions from Names

Key Assumption

Monte Carlo Simulations

Collection of Training Data

Application

Discussion

Findings

Methods


Journal: Political Analysis	Publication Date: Jan 1, 2015
Citations: 24	License type: CC BY 4.0


