Abstract
Semantic change is commonly attested in the historical development of lexicons across the world's languages. Extensive research has sought to characterize regularity in semantic change, but existing studies have typically relied on manual approaches or on the analysis of a restricted set of languages. We present a large-scale computational analysis to explore regular patterns in word meaning change shared across many languages. We focus on two levels of analysis: (1) regularity in directionality, which we explore by inferring the historical direction of semantic change between a source meaning and a target meaning; and (2) regularity in source-target mapping, which we explore by inferring the target meaning given a source meaning. We work with DatSemShift, the world's largest public database of semantic change, which records thousands of meaning changes from hundreds of languages. For directionality inference, we find that concreteness explains directionality in more than 70% of the attested cases of semantic change and is the strongest predictor among the alternatives considered, including frequency and valence. For target inference, we find that a parallelogram-style analogy model based on contextual embeddings predicts the attested source-target mappings substantially better than both chance and similarity-based models. Clustering the meaning pairs of semantic change reveals regular meaning shifts between domains, such as from body parts to geological formations. Our study provides an automated approach and large-scale evidence for multifaceted regularity in semantic change across languages.
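To make the target-inference idea concrete, the following is a minimal sketch of a parallelogram-style analogy model contrasted with a similarity baseline. It is not the authors' implementation: the function names, the meaning labels, and the three-dimensional toy vectors are all hypothetical, chosen only to illustrate the geometry, whereas the study itself uses contextual embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_by_analogy(source, known_pairs, candidates, emb):
    """Parallelogram model: add the mean (target - source) offset of the
    known shifts to the new source, then pick the nearest candidate."""
    dim = len(emb[source])
    offset = [0.0] * dim
    for s, t in known_pairs:
        for i in range(dim):
            offset[i] += (emb[t][i] - emb[s][i]) / len(known_pairs)
    predicted = [emb[source][i] + offset[i] for i in range(dim)]
    return max(candidates, key=lambda c: cosine(predicted, emb[c]))


def predict_by_similarity(source, candidates, emb):
    """Baseline: pick the candidate most similar to the source itself."""
    return max(candidates, key=lambda c: cosine(emb[source], emb[c]))


# Hypothetical toy embeddings (assumption: illustrative values only).
TOY_EMB = {
    "mouth": [1.0, 0.0, 0.2],
    "estuary": [1.0, 1.0, 0.2],
    "head": [0.0, 0.1, 1.0],
    "summit": [0.0, 1.1, 1.0],
    "hat": [0.1, 0.0, 1.2],
    "river": [0.9, 0.9, 0.0],
}

known = [("mouth", "estuary")]  # one assumed body-part-to-landform shift
options = ["summit", "hat", "river"]

analogy_guess = predict_by_analogy("head", known, options, TOY_EMB)
similarity_guess = predict_by_similarity("head", options, TOY_EMB)
print(analogy_guess, similarity_guess)
```

In this toy setup the analogy model reuses the offset from a known shift, so it can select a target in a different domain from the source, while the similarity baseline stays close to the source meaning.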