It’s All in the Name: A Character-Based Approach to Infer Religion

Rochana Chaturvedi, Sugat Chaturvedi

doi:10.1017/pan.2023.6

Open Access

Journal: Political Analysis	Publication Date: Mar 23, 2023
Citations: 6	License type: CC BY 4.0

Abstract

Large-scale microdata on group identity are critical for studies on identity politics and violence but remain largely unavailable for developing countries. We use personal names to infer religion in South Asia, where religion is a salient social division and yet disaggregated data on it are scarce. Existing work predicts religion using a dictionary-based method and, therefore, cannot classify unseen names. We provide character-based machine-learning models that can also classify unseen names with high accuracy. Our models are also much faster and, hence, scalable to large datasets. We explain the classification decisions of one of our models using the layer-wise relevance propagation technique. The character patterns learned by the classifier are rooted in the linguistic origins of names. We apply these to infer the religion of electoral candidates using historical data on Indian elections and observe a trend of declining Muslim representation. Our approach can be used to detect identity groups across the world for whom the underlying names might have different linguistic roots.
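
The contrast drawn in the abstract between dictionary lookup and character-based classification can be illustrated with a minimal sketch. It is not the authors' model: it swaps in a simple naive Bayes classifier over character n-grams, and the names, the labels `group_a` and `group_b`, and the helper names (`char_ngrams`, `NgramNaiveBayes`) are all synthetic and hypothetical, chosen only to show why sub-word patterns let a model label a name it has never seen.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(name, n_min=1, n_max=3):
    """Return the character n-grams of a name; ^ and $ mark its boundaries."""
    padded = "^" + name.lower() + "$"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams (illustrative only)."""

    def fit(self, names, labels):
        self.label_counts = Counter(labels)
        self.ngram_counts = defaultdict(Counter)
        for name, label in zip(names, labels):
            self.ngram_counts[label].update(char_ngrams(name))
        # Vocabulary size is used for add-one (Laplace) smoothing.
        self.vocab = {g for c in self.ngram_counts.values() for g in c}
        return self

    def predict(self, name):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            counts = self.ngram_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(count / total)  # log prior
            for gram in char_ngrams(name):
                # Smoothed log likelihood; unseen n-grams get a small weight.
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Synthetic training data: made-up strings, not real names.
train_names = ["karlon", "marlon", "darlon", "vintez", "sintez", "mintez"]
train_labels = ["group_a"] * 3 + ["group_b"] * 3

# Dictionary baseline: exact-match lookup of names seen in training.
lookup = dict(zip(train_names, train_labels))
model = NgramNaiveBayes().fit(train_names, train_labels)

unseen = "parlon"
print(lookup.get(unseen))     # the dictionary has no entry for an unseen name
print(model.predict(unseen))  # the n-gram model still assigns a label
```

In this toy setting the dictionary returns nothing for `parlon`, while the n-gram model assigns it to the group whose training strings share its character patterns (such as the ending `lon$`).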
