Abstract

Objective. To survey the application and current state of several domestic crowdsourcing models, explore through experiments the factors that affect multilingual manual annotation, and offer suggestions. Methodology. Government news texts in Mandarin, Cantonese, English, and Portuguese from Guangdong, Hong Kong, and Macao were crawled and entered into a database. Crowdsourced corpus tagging was then carried out on the established web platform, collecting a large volume of annotation results and behavioral data. Results. Hypotheses were formed about factors that may affect the quality of manual annotation; SPSS and other data analysis software were used to evaluate how well these hypotheses explain the results; a regression formula for predicting annotation accuracy was derived; and constructive suggestions were offered for corpus annotation quality assurance projects. Limitations. Corpora covering more languages and more professional annotators are needed. Conclusions. The study found that annotation accuracy is strongly related to attributes of the corpus itself, such as total vocabulary size, the number of rare words, and the complexity of parts of speech, and that these conditions differ across languages.
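
The abstract refers to a regression formula that predicts annotation accuracy from corpus attributes (total vocabulary size, rare-word count, part-of-speech complexity). The sketch below is a minimal, hypothetical illustration of that kind of analysis using ordinary least squares; the feature values, sample data, and resulting coefficients are illustrative assumptions and do not reproduce the paper's actual formula or data.

```python
# Hypothetical sketch: regressing annotation accuracy on corpus attributes.
# All numbers below are made up for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row describes one annotated text:
# [total vocabulary size, rare-word count, part-of-speech complexity score]
X = np.array([
    [1200, 35, 0.42],
    [800,  12, 0.31],
    [1500, 60, 0.55],
    [950,  20, 0.38],
    [1700, 75, 0.61],
])
# Observed crowdsourced annotation accuracy for each text.
y = np.array([0.91, 0.96, 0.84, 0.94, 0.80])

model = LinearRegression().fit(X, y)
print("Intercept:", model.intercept_)
print("Coefficients (vocab, rare words, POS complexity):", model.coef_)

# Predict the expected accuracy for a new text with the same feature layout.
print("Predicted accuracy:", model.predict([[1100, 30, 0.45]])[0])
```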
