Abstract

In recent years, Chinese has become one of the most popular languages globally, and the demand for automatic Chinese sentence correction has gradually increased. Such correction can be applied to Chinese language learning to reduce learning cost and feedback time, and it can help writers check for wrong words. The traditional approach to Chinese sentence correction checks whether each word exists in a predefined dictionary. However, this kind of method cannot deal with semantic errors. As deep learning has become popular, artificial neural networks can be applied to understand a sentence’s context and correct semantic errors. However, many issues remain. For example, the accuracy and the computation time required to correct a sentence are still inadequate, so it may be too early to adopt deep-learning-based Chinese sentence correction systems in large-scale commercial applications. Our goal is to obtain a model with better accuracy and shorter computation time. By combining a recurrent neural network (RNN) with Bidirectional Encoder Representations from Transformers (BERT), a recently popular model known for its high performance and slow inference speed, we introduce a hybrid model for Chinese sentence correction that improves both accuracy and inference speed. In our results, BERT-GRU obtained the highest BLEU score in all experiments. Relative to the original Transformer-based model, inference speed improved by 1131% with beam search decoding in the 128-word experiment, and greedy decoding improved by 452%. The longer the sequence, the larger the improvement.

Highlights

  • As communication and transportation technology advances, communicating with foreigners is more common than it used to be

  • To speed up inference for Chinese sentence correction, we introduce a hybrid model that combines Bidirectional Encoder Representations from Transformers (BERT) with RNN-based models, which speeds up inference while preserving the Transformer-based model’s performance

  • By examining the advantages and disadvantages of RNNs and Transformers, we introduce a hybrid model with faster inference speed and better performance


Summary

Introduction

As communication and transportation technology advances, communicating with foreigners is more common than it used to be. Chinese has become one of the most popular languages. In an era of globalization, speaking a few foreign languages is common. It is sometimes hard to tell whether our grammar is correct, so a system that automatically corrects sentences would be valuable. The traditional way to correct sentences is with a predefined dictionary. Although this approach scales well because of its low computational cost, it struggles to correct semantic and grammatical errors because it does not capture sentence-level information.
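
The limitation of dictionary lookup can be illustrated with a minimal sketch. It is not the paper's implementation: the function name, the toy vocabulary, and the pre-segmented example sentences are all assumptions made for illustration. It shows that a lookup flags a non-word but misses a valid word used in the wrong context.

```python
def find_unknown_words(tokens, vocabulary):
    """Return the tokens that are missing from the predefined dictionary."""
    return [tok for tok in tokens if tok not in vocabulary]


# Hypothetical toy dictionary; a real system would load a large lexicon.
VOCABULARY = {"我", "在", "再", "学校", "吃饭"}

# Non-word error: "学笑" is not in the dictionary, so the lookup flags it.
nonword_sentence = ["我", "在", "学笑", "吃饭"]

# Real-word error: "再" (again) is wrongly used for "在" (at).
# Both are valid dictionary entries, so the lookup cannot detect the mistake.
realword_sentence = ["我", "再", "学校", "吃饭"]

flagged_nonword = find_unknown_words(nonword_sentence, VOCABULARY)
flagged_realword = find_unknown_words(realword_sentence, VOCABULARY)

print(flagged_nonword)   # tokens flagged in the non-word sentence
print(flagged_realword)  # tokens flagged in the real-word sentence
```

The second sentence passes the check even though it is wrong, which is the kind of error that requires sentence-level context to catch.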

