Abstract
Machine reading comprehension (MRC), which teaches machines to comprehend a passage and answer corresponding questions, has attracted much attention in recent years. However, most models are designed for English or Chinese MRC tasks. For low-resource languages such as Tibetan, the lack of MRC datasets makes it hard to achieve high performance. To solve this problem, this paper constructs a span-style Tibetan MRC dataset named TibetanQA and proposes a hierarchical attention network model for the Tibetan MRC task, which includes word-level attention and re-read attention. The experiments demonstrate the effectiveness of our model.
Highlights
Machine reading comprehension (MRC) aims to teach machines to read and understand human language text
1) Without large-scale open Tibetan machine reading comprehension (MRC) datasets, the relevant experiments cannot be carried out. This is the main factor that hinders the development of Tibetan MRC. 2) Compared to English MRC, word segmentation tools for Tibetan are still under development
This paper proposes an end-to-end model for Tibetan MRC
Summary
Machine reading comprehension (MRC) aims to teach machines to read and understand human language text. Many Chinese and English MRC datasets have emerged, such as SQuAD [1], MCTest [2], MS-MARCO [3], and Du-Reader Dataset [4]. Following these datasets, many models have been proposed, such as S-Net [5], AS Reader [6], and IA Reader [7]. For Tibetan, however, large-scale open MRC datasets are lacking, and this is the main factor that hinders the development of Tibetan MRC. The task also requires the MRC model to strengthen its understanding.
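To make the span-style setting named in the abstract concrete, here is a minimal sketch of how a span-style MRC system typically turns per-token scores into an answer: it picks the start and end positions whose combined score is highest. This is a generic illustration, not the paper's hierarchical attention model. The function name `best_span`, the `max_len` limit, the English placeholder tokens, and the scores are all assumptions made for the example.

```python
from typing import List, Tuple


def best_span(start_scores: List[float],
              end_scores: List[float],
              max_len: int = 30) -> Tuple[int, int]:
    """Return (start, end) token indices that maximise
    start_scores[start] + end_scores[end], with start <= end
    and a span of at most max_len tokens.

    Hypothetical helper for illustration only.
    """
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    if max_len < 1:
        raise ValueError("max_len must be at least 1")

    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider end positions at or after the start position.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Placeholder passage tokens and made-up scores (not from TibetanQA).
tokens = ["Lhasa", "is", "the", "capital", "of", "Tibet"]
start = [2.0, 0.1, 0.0, 0.3, 0.0, 0.5]
end = [1.5, 0.0, 0.0, 0.2, 0.0, 0.4]

s, e = best_span(start, end)
answer = tokens[s:e + 1]
```

In a real system the two score lists would come from the model's output layer. The exhaustive search here is quadratic in the worst case, but the `max_len` cap keeps it cheap for typical passage lengths.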