Teaching Machines to Read and Comprehend Tibetan Text

Yuan Sun, Xiaobing Zhao, Zhengcuo Dan, Sisi Liu, Chaofan Chen

doi:10.4236/jcc.2021.99011

Abstract

Machine reading comprehension (MRC), which teaches a machine to comprehend a passage and answer corresponding questions, has attracted much attention in recent years. However, most models are designed for English or Chinese MRC tasks. Because MRC datasets are lacking for low-resource languages such as Tibetan, it is hard to achieve high performance on MRC tasks in these languages. To address this problem, this paper constructs a span-style Tibetan MRC dataset named TibetanQA and proposes a hierarchical attention network model for the Tibetan MRC task, which includes word-level attention and re-read attention. Experiments demonstrate the effectiveness of the model.

Highlights

Machine reading comprehension (MRC) aims to teach machines to read and understand human language text
Without large-scale open Tibetan machine reading comprehension (MRC) datasets, the relevant experiments cannot be carried out; this is the main factor that hinders the development of Tibetan MRC. In addition, compared with English MRC, word segmentation tools for Tibetan are still under development
This paper proposes an end-to-end model for Tibetan MRC

Summary

Introduction

Machine reading comprehension (MRC) aims to teach machines to read and understand human language text. Many Chinese and English MRC datasets have emerged, such as SQuAD [1], MCTest [2], MS-MARCO [3], and the Du-Reader dataset [4]. Following these datasets, many models have been proposed, such as S-Net [5], AS Reader [6], and IA Reader [7]. For Tibetan, however, the lack of large-scale open MRC datasets is the main factor that hinders the development of Tibetan MRC. In addition, because word segmentation tools for Tibetan are still under development, the MRC model needs to strengthen its understanding.
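The abstract describes TibetanQA as a span-style dataset. In span-style datasets such as SQuAD [1], the answer is a contiguous span of the passage, usually stored together with its start offset. The following is a minimal sketch of that idea, not the TibetanQA schema: the class name, field names, and the English example text are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SpanExample:
    """One hypothetical span-style MRC record (field names are illustrative)."""
    passage: str
    question: str
    answer_text: str
    answer_start: int  # character offset of the answer inside the passage

    @property
    def answer_end(self) -> int:
        # Exclusive end offset of the answer span.
        return self.answer_start + len(self.answer_text)

    def is_consistent(self) -> bool:
        # A span-style answer must be recoverable by slicing the passage.
        return self.passage[self.answer_start:self.answer_end] == self.answer_text


def locate_answer(passage: str, answer_text: str) -> int:
    """Return the first character offset of answer_text, or -1 if it is not a span."""
    return passage.find(answer_text)


# Made-up example text, not taken from TibetanQA.
example_passage = "Lhasa is the capital of the Tibet Autonomous Region."
example = SpanExample(
    passage=example_passage,
    question="What is the capital of the Tibet Autonomous Region?",
    answer_text="Lhasa",
    answer_start=locate_answer(example_passage, "Lhasa"),
)
```

Under this representation a model only has to predict a start and an end position in the passage, and `is_consistent()` gives a simple check that an annotated answer really is a span of its passage.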

Related Work

Dataset Construction

Passage Collection

Question Construction

Answer Verification

Data Preprocessing

Input Embedding Layer

Word-Level Attention

Re-Read Attention

Output Layer

Dataset and Evaluation

Experiments on Different Models

Findings

Conclusions


Journal: Journal of Computer and Communications
Publication Date: Jan 1, 2021
License type: CC BY 4.0

