Abstract

Chinese Spelling Check (CSC) is a crucial task that aims to detect and correct spelling errors in text. Most previous methods focus on detecting and correcting character-level errors. In this paper, we instead detect spelling errors at the sentence level and treat the problem as binary classification. We propose a new detection paradigm: a binary classifier based on BERT, fine-tuned with sentences containing errors as negative examples. We also propose a pair-shuffle training method to improve fine-tuning. Experimental results¹ on the SIGHAN2015 dataset demonstrate that our paradigm and training method outperform the SOTA model.

¹ https://github.com/jiangjin1999/Sentence-level-detection-on-CSC
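
The labeling scheme described above can be illustrated with a minimal sketch. It assumes that the training data comes as (source, target) sentence pairs, where the source may contain spelling errors and the target is its corrected form; that layout, the label values, and the helper names `build_pair` and `shuffle_pairs` are all illustrative assumptions, not taken from the paper. The abstract does not spell out the pair-shuffle procedure, so `shuffle_pairs` shows only one possible reading (shuffling whole pairs while keeping each erroneous sentence next to its correction) and should not be taken as the authors' method.

```python
import random
from typing import List, Tuple

# Assumed label convention: a sentence containing a spelling error is a
# negative example (0); an error-free sentence is a positive example (1).
NEGATIVE, POSITIVE = 0, 1

Example = Tuple[str, int]


def build_pair(source: str, target: str) -> List[Example]:
    """Turn one (source, target) pair into sentence-level labeled examples."""
    if source == target:
        # The source has no error, so it only yields a positive example.
        return [(target, POSITIVE)]
    return [(source, NEGATIVE), (target, POSITIVE)]


def shuffle_pairs(pairs: List[Tuple[str, str]], seed: int = 0) -> List[Example]:
    """Hypothetical pair-level shuffle: the order of pairs is randomized,
    but the two sentences of each pair stay adjacent."""
    rng = random.Random(seed)
    groups = [build_pair(src, tgt) for src, tgt in pairs]
    rng.shuffle(groups)
    return [example for group in groups for example in group]


if __name__ == "__main__":
    data = [
        ("我今天很高心", "我今天很高兴"),  # 心 is a misspelling of 兴
        ("天气很好", "天气很好"),          # already correct
    ]
    for sentence, label in shuffle_pairs(data, seed=1):
        print(label, sentence)
```

The resulting (sentence, label) list could then be fed to any sentence classifier, such as a fine-tuned BERT model with a two-class output head.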

