An Unsupervised Domain-Adaptive Framework for Chinese Spelling Checking

Xi Wang, Piji Li, Jing Li, Ruoqing Zhao

doi:10.1145/3689821

Abstract

Chinese Spelling Check (CSC) is a meaningful task in Natural Language Processing (NLP) that aims to detect spelling errors in Chinese text and then correct them. Typical current CSC models perform impressively on general datasets with the help of pretrained language models such as BERT, but suffer substantial performance loss on downstream tasks with domain-specific terms because they are trained primarily on general corpora. To verify the cross-domain adaptation ability of these models, we build three new datasets with abundant domain-specific terms in the financial, medical, and legal domains and conduct empirical investigations on the corresponding domain-specific test sets. In response to the poor performance of existing models, we propose a framework named uChecker, which uses an unsupervised method for spelling error detection and correction. Experimental results show that uChecker performs well on domain-specific test datasets without losing performance in the general domain.

More From: ACM Transactions on Asian and Low-Resource Language Information Processing

Similar Papers

Is Chinese Spelling Check ready? Understanding the correction behavior in real-world scenarios
Liner Yang ... Erhong Yang
AI Open | VOL. 4
01 Jan 2023

Visual and Phonological Feature Enhanced Siamese BERT for Chinese Spelling Error Correction
Yujia Liu ... Hongliang Guo
Applied Sciences | VOL. 12
30 Apr 2022

A Hybrid Approach to Automatic Corpus Generation for Chinese Spelling Check
Dingmin Wang ... Jing Li
01 Jan 2018

IME-Spell
Qingbiao Zhao ... Xingfa Shen
18 Dec 2020
