Corpus Extraction and Refinement Process for AI-Based Classical Chinese Translation

Byeong-Gu Jeon

doi:10.37736/kjlr.2023.08.14.4.02

Abstract

As the field of AI expands day by day, it is appealing to imagine using it to translate a large body of Chinese classics, because putting machine translation into practical use could drastically reduce the manpower and time spent on classical Chinese translation. To this end, various institutions are developing AI programs for Chinese character recognition and classical Chinese translation and are improving their performance. Research on AI-based classical Chinese translation, however, is concentrated on the technology itself, and no studies have been reported on how the corpus is extracted and refined. Deep learning, which is used to train the AI, requires machine translation data, that is, a parallel corpus that aligns the Chinese text with its translation. To build such a corpus, a large amount of corpus data is extracted from human translations and then refined according to detailed guidelines to produce a high-quality corpus. This study examined the refinement process to confirm how the extracted corpus data were selected. The examination showed that words such as titles, items, and names, as well as metrological units and numbers of people, were all excluded. Short sentences containing place names, personal names, government posts, places, products, dates, and other proper nouns were also removed because they were not appropriate as corpus data. In addition, sentence-initial elements of one or two characters, such as adverbs, conjunctions, tense, and pronunciation, were deleted. If these findings are taken into account when extracting data to build corpora for AI-based classical Chinese translation, the time and expense of translation are expected to fall greatly.
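The three refinement rules described in the abstract can be pictured with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the study: the `Pair` type, the `SHORT_LIMIT` threshold, the tiny `PROPER_NOUNS` set, the `LEADING_ELEMENTS` list, and the choice to apply the rules to the source side only.

```python
from dataclasses import dataclass

# Illustrative assumptions only; the study's actual guidelines are not given here.
SHORT_LIMIT = 6                               # hypothetical "short sentence" length
PROPER_NOUNS = {"漢陽", "李舜臣"}              # hypothetical place and personal names
LEADING_ELEMENTS = ("於是", "又", "且", "然")  # hypothetical 1-2 character openers


@dataclass(frozen=True)
class Pair:
    """One parallel-corpus entry: classical Chinese source and its translation."""
    source: str
    target: str


def is_bare_entry(text: str) -> bool:
    """Rule 1: the segment is only a name or similar label, not a sentence."""
    return text in PROPER_NOUNS


def is_short_with_proper_noun(text: str) -> bool:
    """Rule 2: a short sentence that contains a proper noun."""
    return len(text) < SHORT_LIMIT and any(noun in text for noun in PROPER_NOUNS)


def strip_leading_element(text: str) -> str:
    """Rule 3: delete a one- or two-character sentence-initial element."""
    for element in LEADING_ELEMENTS:
        # Never strip the whole segment down to an empty string.
        if text.startswith(element) and len(text) > len(element):
            return text[len(element):]
    return text


def refine(pairs: list[Pair]) -> list[Pair]:
    """Apply the three rules in order and return the surviving pairs."""
    refined = []
    for pair in pairs:
        source = pair.source.strip()
        if not source or is_bare_entry(source) or is_short_with_proper_noun(source):
            continue
        refined.append(Pair(strip_leading_element(source), pair.target.strip()))
    return refined
```

In this sketch the translation side is left untouched; a real pipeline would also need to keep the target aligned with any element removed from the source, and would rely on much larger name lists or a tagger rather than a hand-written set.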

More From: Korean Association for Literacy
