BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents

Teakgyu Hong, Daehyun Nam, Wonseok Hwang, Sungrae Park, Mingi Ji, Donghyun Kim

doi:10.1609/aaai.v36i10.21322

Abstract

Key information extraction (KIE) from document images requires understanding the contextual and spatial semantics of texts in two-dimensional (2D) space. Many recent studies try to solve the task by developing pre-trained language models that combine visual features from document images with texts and their layout. In contrast, this paper tackles the problem by going back to the basics: an effective combination of text and layout. Specifically, we propose a pre-trained language model, named BROS (BERT Relying On Spatiality), that encodes relative positions of texts in 2D space and learns from unlabeled documents with an area-masking strategy. With this optimized training scheme for understanding texts in 2D space, BROS performs comparably to or better than previous methods on four KIE benchmarks (FUNSD, SROIE*, CORD, and SciTSR) without relying on visual features. This paper also reveals two real-world challenges in KIE tasks, namely (1) minimizing the error from incorrect text ordering and (2) learning efficiently from fewer downstream examples, and demonstrates the superiority of BROS over previous methods on both.
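The abstract names an area-masking strategy but does not describe its procedure. As a rough illustration only, the sketch below assumes one plausible reading: pick a seed token, grow its bounding box by a fixed ratio, and mask every token whose box center falls inside the grown area. The names `Token`, `area_mask`, and the `expand` parameter are hypothetical and are not taken from the paper or from any released code.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page coordinates


@dataclass
class Token:
    text: str
    box: Box


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a bounding box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def area_mask(
    tokens: List[Token],
    expand: float = 1.0,          # assumed expansion ratio, not from the paper
    rng: Optional[random.Random] = None,
    mask_token: str = "[MASK]",
) -> Tuple[List[str], List[int]]:
    """Mask all tokens whose centers lie in an area grown around a random seed token."""
    rng = rng or random.Random()
    seed = rng.choice(tokens)
    x0, y0, x1, y1 = seed.box
    dx = (x1 - x0) * expand
    dy = (y1 - y0) * expand
    area = (x0 - dx, y0 - dy, x1 + dx, y1 + dy)

    masked, targets = [], []
    for i, tok in enumerate(tokens):
        cx, cy = center(tok.box)
        inside = area[0] <= cx <= area[2] and area[1] <= cy <= area[3]
        masked.append(mask_token if inside else tok.text)
        if inside:
            targets.append(i)  # positions the model would be asked to predict
    return masked, targets


tokens = [
    Token("Total", (0, 0, 10, 2)),
    Token("12.50", (12, 0, 20, 2)),
    Token("Thanks", (0, 50, 10, 52)),
]
masked, targets = area_mask(tokens, rng=random.Random(0))
```

Compared with masking tokens independently, masking by area hides spatially neighboring text together, so a model trained this way would have to rely on more distant layout context to recover the hidden tokens.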

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: Jun 28, 2022
Citations: 43
