Abstract

This study constructed a computer scoring model for Chinese EFL learners’ English-to-Chinese translations using multidisciplinary techniques from corpus linguistics, natural language processing, information retrieval and statistics. The proposed model, once implemented as computer software, can score English-to-Chinese translations in large-scale examinations. This study built five tentative scoring models with 50, 100, 130, 150 and 180 translations as the training set, drawn from 300 translations of an expository text. The correlation coefficients between the scores computed by these models and the human-assigned scores were all above 0.8. The results further indicated that the scores computed with 130 training translations were closest to the human-assigned scores. It was therefore concluded that the text features extracted in this research were effective and that the finalized model can produce reliable scores for Chinese EFL learners’ English-to-Chinese expository translations.
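
As a rough illustration of the evaluation design described above, the sketch below fits a scoring model on each training-set size and reports the Pearson correlation between its predicted scores and the human-assigned scores on the remaining translations. This is a minimal sketch, not the authors’ code: the linear regression, the random placeholder `features` and `human_scores`, and the feature dimensionality are all assumptions standing in for the study’s extracted text features and human ratings, so the printed correlations are meaningless; only the procedure is of interest.

```python
# Minimal sketch of the evaluation loop: for each training-set size used in
# the study, fit a scoring model and correlate its predictions with the
# human-assigned scores on the remaining translations.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_total = 300                                 # 300 translations in the study
features = rng.normal(size=(n_total, 10))     # placeholder text features
human_scores = rng.normal(size=n_total)       # placeholder human ratings

for n_train in (50, 100, 130, 150, 180):      # training-set sizes from the study
    model = LinearRegression().fit(features[:n_train], human_scores[:n_train])
    predicted = model.predict(features[n_train:])
    r, _ = pearsonr(predicted, human_scores[n_train:])
    print(f"train={n_train:3d}  Pearson r = {r:.3f}")
```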

Highlights

  • Since the 1960s, several automated essay scoring systems have been developed and applied to the GRE, GMAT and other large-scale examinations [1, 2, 3]

  • A few researchers have also studied the automatic scoring of Chinese writing and found that scores computed using latent semantic analysis (LSA) were close to human-rated scores [5] (see the LSA sketch after this list)

  • This study addresses the following questions: (1) How much predictive power do computer scoring models built on training sets of different sizes have? (2) How reliable are the predicted scores?
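
To make the LSA approach mentioned in the second highlight concrete, here is a minimal sketch, assuming a TF-IDF plus truncated-SVD pipeline: documents are projected into a low-dimensional latent semantic space, and a candidate translation is scored by its cosine similarity to reference translations. The toy English sentences are hypothetical stand-ins for the Chinese data in [5], where tokenization would additionally require word segmentation.

```python
# Minimal LSA sketch: TF-IDF vectors reduced by truncated SVD, then a
# candidate translation scored by cosine similarity to reference translations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

references = ["the model scores translations automatically",
              "automated scoring of learner translations",
              "human raters assess translation quality"]
candidate = ["a system that scores learner translations"]

tfidf = TfidfVectorizer().fit(references + candidate)
svd = TruncatedSVD(n_components=2, random_state=0)   # latent semantic space
ref_vecs = svd.fit_transform(tfidf.transform(references))
cand_vec = svd.transform(tfidf.transform(candidate))

# Mean similarity to the references serves as a rough LSA-based quality score.
print(cosine_similarity(cand_vec, ref_vecs).mean())
```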

Summary

Introduction

Since the 1960s, several automated essay scoring systems have been developed and applied to the GRE, GMAT and other large-scale examinations [1, 2, 3]. The diagnostic model was composed of four types of modules that can evaluate the form and meaning of both text translation and sentence translation, and these modules can provide learners with useful information. The selective model can evaluate the semantic quality of text translation in large-scale tests. The conclusions of this study retain some uncertainties. It used only 300 translated texts of a narrative to build scoring models, and because different types of texts differ remarkably in content and language, it is hard to determine whether the quality predictors will be effective for other text types. The study also used a hold-out method, by which the training translations were used only for modeling and the validation translations were used only to test the models, so the results might differ if the two sets switched roles [15], as sketched below.
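
The hold-out concern can be made concrete with a small sketch: fit on one fixed portion, validate on the other, then swap the two roles and compare the resulting correlations. The placeholder `features` and `human_scores` below are assumptions, not the study’s data, so the numbers themselves are meaningless; the point is that the two splits need not agree.

```python
# Minimal sketch of the hold-out role-swap: train on one half and validate on
# the other, then exchange the halves and recompute the correlation.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
features = rng.normal(size=(300, 8))    # placeholder text features
human_scores = rng.normal(size=300)     # placeholder human ratings

half = 150
splits = [(slice(0, half), slice(half, None)),   # original roles
          (slice(half, None), slice(0, half))]   # roles swapped
for train, val in splits:
    model = LinearRegression().fit(features[train], human_scores[train])
    r, _ = pearsonr(model.predict(features[val]), human_scores[val])
    print(f"Pearson r = {r:.3f}")
```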
