Abstract

Machine reading comprehension (MRC), which teaches machines to comprehend a passage and answer corresponding questions, has attracted much attention in recent years. However, most models are designed for English or Chinese MRC tasks. For low-resource languages such as Tibetan, the lack of MRC datasets makes it hard to achieve high performance. To solve this problem, this paper constructs a span-style Tibetan MRC dataset named TibetanQA and proposes a hierarchical attention network model for the Tibetan MRC task that includes word-level attention and re-read attention. Experiments demonstrate the effectiveness of our model.
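The abstract mentions word-level attention between question and passage. As a rough illustration only (the paper's actual architecture is not detailed here), a common form of word-level attention aligns each passage word with the question words via softmax-normalized dot-product similarity; the sketch below assumes simple dense word embeddings and is not the authors' implementation:

```python
import numpy as np

def word_level_attention(passage: np.ndarray, question: np.ndarray) -> np.ndarray:
    """Return a question-aware representation for each passage word.

    passage:  (P, d) embeddings of P passage words
    question: (Q, d) embeddings of Q question words
    """
    # similarity between every passage word and every question word
    scores = passage @ question.T                    # shape (P, Q)
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)    # softmax over question words
    # each passage word becomes a weighted mix of question-word embeddings
    return weights @ question                        # shape (P, d)

# toy example: 3 passage words, 2 question words, 4-dim embeddings
rng = np.random.default_rng(0)
passage_emb = rng.normal(size=(3, 4))
question_emb = rng.normal(size=(2, 4))
out = word_level_attention(passage_emb, question_emb)
```

Each output row is a convex combination of the question-word embeddings, so passage words that resemble the question receive question-focused representations; a re-read mechanism would typically apply a second attention pass over such enriched representations.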

Highlights

  • Machine reading comprehension (MRC) aims to teach machines to read and understand human language text

  • 1) Without large-scale open Tibetan machine reading comprehension (MRC) datasets, the relevant experiments cannot be carried out; this is the main factor that hinders the development of Tibetan MRC. 2) Compared to English MRC, word segmentation tools for Tibetan are still under development

  • This paper proposes an end-to-end model for Tibetan MRC


Summary

Introduction

Machine reading comprehension (MRC) aims to teach machines to read and understand human language text. Many Chinese and English MRC datasets have emerged, such as SQuAD [1], MCTest [2], MS-MARCO [3], and the DuReader dataset [4]. Following these datasets, many models have been proposed, such as S-Net [5], AS Reader [6], and IA Reader [7]. Tibetan, however, lacks such large-scale datasets; this is the main factor that hinders the development of Tibetan MRC, and the task requires the MRC model to strengthen its understanding of the language.

Related Work
Dataset Construction
Passage Collection
Question Construction
Answer Verification
Data Preprocessing
Input Embedding Layer
Word-Level Attention
Re-Read Attention
Output Layer
Dataset and Evaluation
Experiments on Different Models
Findings
Conclusions