A Computational Model of Semantic Change Using Word Vectors: The Case of 「家」 (jiā, "home/family")

陳蓓怡

doi:10.6342/ntu202004346

Abstract

This research investigates historical semantic change from the perspective of quantitative and computational linguistics. With the rapid accumulation of texts in the digital era, a more temporally aware interpretation of language use and meaning construction is called for. Meanwhile, the digitization of historical texts opens up more opportunities to trace the diachronic development of words and meanings. In particular, semantic change motivated by linguistic features and factors can be explored with a data-driven approach. Language is a means of communication through which ideas are conveyed, stored, and recorded, and in essence, constant change and evolution occur as speakers use the language over time (Blank, 1999: 61). The dynamics of meaning construction are embodied in the emergence and loss of senses, as well as in their splits and shifts. These processes contribute to the different distributions and interactions of words and reflect the regularities and adaptability of the language, as well as the cognition and culture operating behind it (Blank, 1999: 63). Synchronic variation can be examined through a diachronic lens. A corpus-based, data-driven approach makes it possible to observe semantic change and to derive generalizations about it. Coupled with advances in vector space models and statistical analysis, this approach is used here to explore changes in meaning. Polysemy is a driving force of semantic change. Concepts and meanings are structured in words and language use, and word formation in Chinese is addressed through the development from monosyllabic to disyllabic words, which allows us to explore the influence of homophony, the interaction between words, and the growth of disyllabic words and compounds. Given the demand for historical textual data, computational semantics and statistical models help to resolve these dilemmas.
Moreover, semantic change may manifest not in observed frequency but in other distributional properties, making the encoded meanings distinctly different from those of previous time periods. As distributed models such as word embeddings receive growing attention, historical semantic change is a research topic that should enter the discussion. In corpus linguistics, such research methods are based on co-occurrences of words in context, and the co-occurrence distribution represents similarities and differences in meaning interactions. The diachronic corpus consists of texts from two sources: the Chinese Text Project (Sturgeon, 2019) and, for modern Chinese, the Academia Sinica Balanced Corpus of Modern Chinese (Chen et al., 1996). By applying a quantitative inquiry into semantic change, we measure the degree of semantic change, support known cases of change, and discover unknown ones, in consultation with lexical databases. First, the global measures proposed by Hamilton et al. (2016a) are adopted. Second-order embeddings composed of similarity scores to keywords are formed to compare the meaning representations of different eras. The lower the correlation between two temporally adjacent vectors, the higher the degree of semantic change. Second, based on the distribution and interaction of a word's senses, the semantic trajectories of the word are traced. Finally, this study proceeds with a periodization analysis using the Variability-based Neighbor Clustering (VNC) method (Gries and Hilpert, 2012). As a hierarchical clustering method, VNC is bottom-up (agglomerative), as opposed to divisive clustering. A comprehensive evaluation of the influence of the selected linguistic factors is carried out to explore how the development of meaning construction can be understood across different stages. In sum, this study examines semantic change in retrospect to trace semantic development in diachrony.
The computational and statistical modeling of historical lexical semantic change will shed new light on how the language community describes and makes sense of a society that is itself constantly changing.
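
As an illustration only, the two quantitative steps described in the abstract can be sketched in Python. This is a minimal sketch under stated assumptions, not the study's implementation: the function names, the keyword list, the random toy vectors, the use of Pearson correlation, and the "closest adjacent means" merge criterion are all hypothetical choices made for the example. The first part builds a second-order vector (similarities of the target word to a fixed keyword list) for each period and scores change as one minus the correlation between adjacent periods. The second part is a simplified neighbor-constrained agglomerative clustering in the spirit of VNC, in which only temporally adjacent clusters may merge.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def second_order_vector(target, keywords, embeddings):
    """Similarity of the target word to each keyword within one period."""
    return np.array([cosine(embeddings[target], embeddings[k]) for k in keywords])


def change_score(target, keywords, emb_earlier, emb_later):
    """One minus the Pearson correlation of two second-order vectors.

    Lower correlation between adjacent periods gives a higher score.
    """
    s_earlier = second_order_vector(target, keywords, emb_earlier)
    s_later = second_order_vector(target, keywords, emb_later)
    return 1.0 - float(np.corrcoef(s_earlier, s_later)[0, 1])


def vnc_merge_order(values):
    """Simplified neighbor-constrained clustering (assumed criterion).

    Repeatedly merges the two temporally adjacent clusters whose mean
    values are closest. Returns the merge steps in order, each as a
    pair (left_cluster, right_cluster) of period-index lists.
    """
    clusters = [[i] for i in range(len(values))]
    merges = []
    while len(clusters) > 1:
        means = [np.mean([values[i] for i in c]) for c in clusters]
        gaps = [abs(means[j + 1] - means[j]) for j in range(len(clusters) - 1)]
        j = int(np.argmin(gaps))
        merges.append((clusters[j], clusters[j + 1]))
        clusters[j:j + 2] = [clusters[j] + clusters[j + 1]]
    return merges


# Toy data: random vectors stand in for embeddings trained per period.
rng = np.random.default_rng(0)
vocab = ["家", "室", "國", "人", "戶"]  # hypothetical target and keywords
period_a = {w: rng.normal(size=8) for w in vocab}
period_b = {w: rng.normal(size=8) for w in vocab}
keywords = vocab[1:]

same_period_score = change_score("家", keywords, period_a, period_a)
cross_period_score = change_score("家", keywords, period_a, period_b)

# Hypothetical per-period change scores, in temporal order.
merge_steps = vnc_merge_order([0.10, 0.11, 0.50, 0.53])
```

Because each second-order vector is computed inside its own period's embedding space, the comparison does not require aligning the two spaces first. In `merge_steps`, early merges group periods that behave alike, and the last merge marks the strongest candidate boundary between stages.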
