Abstract

Kinship is an important topic in historical studies, and a kinship database is a key resource for analyzing the structure, succession, and evolution of families. However, a single kinship relation can be expressed by different words, and a single kinship word can be vague or ambiguous in natural language, especially in pre-modern Chinese. For example, the well-known China Biographical Database contains 484,066 kinship instances expressed with more than 400 kinship words. As a result, the relations extracted from historical texts cannot be used directly to build family networks. In this article, we propose a novel method that normalizes kinship relations using three basic relations: father–descendant, mother–descendant, and husband–wife, together with the gender of each person. All types of kinship are reduced to these three basic relations. In this way, we identify 178,390 basic kinship relations that fully describe the original 462,147 unambiguous kinship instances, while finding 3,989 inconsistencies and inferring 5,805 missing persons. We then generate 29,423 families from the basic kinship relations and analyze their properties, such as size, depth, and intermarriage across families. This type of family analysis was almost impossible before kinship relations were normalized. The technique therefore enables improved family database construction and deeper quantitative analysis.
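
The idea of reducing a compound kinship word to basic relations can be illustrated with a minimal sketch. It is not the authors' implementation: the term table `TERM_PATHS`, the step codes `F`, `M`, and `W`, the function `normalize`, and the `unknown_N` placeholder naming are all hypothetical, and gender checking and inconsistency detection are left out. Each kinship word is assumed to map to a path of basic steps from the first person to the second, and any intermediate person on the path who is not named is given a placeholder, which loosely mirrors the inference of missing persons.

```python
from itertools import count

# Hypothetical term table: each kinship word is a path of basic steps
# leading from ego to the related person.
#   "F" = father of the current person
#   "M" = mother of the current person
#   "W" = wife of the current person (current person is the husband)
TERM_PATHS = {
    "father": ["F"],
    "mother": ["M"],
    "paternal grandfather": ["F", "F"],
    "maternal grandfather": ["M", "F"],
    "wife's father": ["W", "F"],
}


def normalize(ego, term, alter, placeholder_ids):
    """Expand one kinship instance into basic relation triples.

    Each triple is (relation, first, second), where `first` is the
    father, mother, or husband and `second` is the descendant or wife.
    Intermediate persons get placeholder names such as "unknown_1".
    """
    path = TERM_PATHS[term]
    triples = []
    current = ego
    for i, step in enumerate(path):
        is_last = i == len(path) - 1
        nxt = alter if is_last else f"unknown_{next(placeholder_ids)}"
        if step == "F":
            triples.append(("father-descendant", nxt, current))
        elif step == "M":
            triples.append(("mother-descendant", nxt, current))
        elif step == "W":
            triples.append(("husband-wife", current, nxt))
        else:
            raise ValueError(f"unknown step code: {step}")
        current = nxt
    return triples


if __name__ == "__main__":
    ids = count(1)
    for triple in normalize("A", "paternal grandfather", "B", ids):
        print(triple)
```

In this sketch, "B is the paternal grandfather of A" becomes two father–descendant relations linked through one placeholder person. A fuller treatment would also merge a placeholder with a known person whenever another instance names A's father.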
