Entropy-based syntactic tree analysis for text classification: a novel approach to distinguishing between original and translated Chinese texts

Zhongliang Wang, Andrew K F Cheung, Kanglong Liu

doi:10.1093/llc/fqae030

Abstract

This research classifies translated and non-translated Chinese texts by analyzing syntactic rule features, integrating machine learning with entropy analysis. The methodology uses information entropy as a quantitative measure of the complexity of syntactic rules, as derived from tree-based annotations. The goal is to determine whether translated Chinese texts exhibit syntactic characteristics significantly different from those of non-translated texts, such that the two can be reliably classified. To this end, the study calculates information entropy values for syntactic rules in two comparable corpora, one of translated and one of non-translated Chinese texts. Various machine learning models are then applied to these entropy metrics to identify significant differences between the two groups. The results show significant differences in syntactic structure: translated texts have higher entropy, indicating more complex syntactic constructions than non-translated texts. These findings contribute to our understanding of how translation affects syntax, with implications for text classification and translation studies.
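As a rough illustration of the entropy measure described in the abstract, the sketch below computes the Shannon entropy of a text's syntactic rule distribution. The abstract does not give the exact formula or rule inventory, so this assumes the standard definition H = -Σ p(r) log2 p(r) over the relative frequencies of rules r; the function name `rule_entropy` and the toy rule strings are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def rule_entropy(rules):
    """Shannon entropy (in bits) of the empirical distribution of rules.

    `rules` is an iterable of rule labels, e.g. production rules read off
    a parse tree. This is an assumed formulation, not the authors' code.
    """
    counts = Counter(rules)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over distinct rules, with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical rule sequences standing in for two parsed texts.
varied_rules = ["IP -> NP VP", "NP -> NN", "VP -> VV NP", "NP -> NN"]
uniform_rules = ["IP -> NP VP"] * 4

print(rule_entropy(varied_rules))   # 1.5
print(rule_entropy(uniform_rules))  # 0.0
```

A text that reuses one rule throughout scores zero, while a text that spreads its probability mass over more rules scores higher. Per-text values of this kind could then serve as input features for a classifier, which is the step the abstract attributes to the machine learning models and is not sketched here.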

Journal: Digital Scholarship in the Humanities
Publication Date: Jun 5, 2024
Citations: 1
