Abstract

With the rapid development of 5G and machine learning, innovative information systems continue to emerge. Among them, the public service information system (PSIS) has attracted extensive attention from researchers. However, existing PSISs can handle only simple tasks and cannot answer complex questions raised by citizens. In this paper, we explore, from scratch, the application of machine reading comprehension to Chinese public service information systems. Machine reading comprehension is an active research area in machine learning with considerable application potential in PSISs. However, work in this area is still lacking: neither large-scale datasets nor neural network models are available. To address this gap, we first create a large-scale machine reading comprehension dataset of Chinese public service information, including public service affair guidelines, policies, etc. Next, we propose several new neural network models that are continually pretrained on the Chinese public service corpus. The experimental results show that the proposed models achieve a significant improvement over several previous SOTA models on the new dataset. However, they remain far below human performance, indicating that the proposed dataset is challenging.

Highlights

  • In recent decades, governments across the globe have been facing numerous challenges in improving the efficiency and equity of public services. The growing pressure on public services has led governments to pay increasing attention to novel information technology such as machine learning.

  • Machine reading comprehension can be applied in various public service information systems (PSISs), including question answering bots (QABots), search engines, and dialogue systems, and it could change the way citizens interact with the government.

  • We explore the application of machine reading comprehension to the Chinese public service from scratch.

Summary

Introduction

Governments across the globe have been facing numerous challenges in improving the efficiency and equity of public services. The growing pressure on public services has led governments to pay increasing attention to novel information technology such as machine learning. Machine reading comprehension can be applied in various PSISs, including question answering bots (QABots), search engines, and dialogue systems, and it could change the way citizens interact with the government. (i) We created a new machine reading comprehension dataset of Chinese public service information. (ii) To evaluate the new dataset, we applied several previous SOTA models as baselines. Directly applying these models often yields suboptimal results because the word distribution of a general-domain corpus differs from that of a public service corpus, a difference that is especially pronounced in Chinese. (iv) We proposed the first domain-specific language models for Chinese public service, which were continually pretrained on our own corpus. The experimental results show that the new models consistently outperform the previous SOTA models, but there is still a large gap compared with human performance, indicating that the proposed dataset is challenging.
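The point about differing word distributions can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' method: it compares character-level unigram distributions of two tiny, made-up corpora using the Jensen-Shannon divergence. The example sentences, the character-level tokenization, and the helper names `unigram_distribution` and `js_divergence` are all assumptions introduced here.

```python
import math
from collections import Counter


def unigram_distribution(texts):
    """Return relative character frequencies over a list of strings."""
    counts = Counter(ch for text in texts for ch in text if not ch.isspace())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two distributions.

    The value lies in [0, 1]: 0 for identical distributions,
    1 for distributions with no shared symbols.
    """
    vocab = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the joint vocabulary.
    m = {ch: 0.5 * (p.get(ch, 0.0) + q.get(ch, 0.0)) for ch in vocab}

    def kl_to_mixture(a):
        # KL(a || m); symbols with zero probability in a contribute nothing.
        return sum(pa * math.log2(pa / m[ch]) for ch, pa in a.items() if pa > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Toy, made-up sentences (assumption: NOT taken from the paper's dataset).
general_corpus = ["今天天气很好", "我需要申请休假"]
public_service_corpus = ["办理居住证需要提交身份证", "申请人应当提交申请材料"]

p_general = unigram_distribution(general_corpus)
p_service = unigram_distribution(public_service_corpus)
divergence = js_divergence(p_general, p_service)
print(f"JS divergence between the toy corpora: {divergence:.3f}")
```

A larger divergence suggests that a model pretrained on general-domain text sees a different vocabulary profile at fine-tuning time, which is one motivation for continued pretraining on in-domain text. For real Chinese corpora, word-level segmentation would likely be more informative than single characters.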

Related Works
Task Definition and Evaluation Metrics
Dataset Construction and Description
Proposed Models
Evaluation
Details of Proposed Models
Experimental Results
Conclusions