Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation

Goran Glavaš, Swapna Somasundaran

doi:10.1609/aaai.v34i05.6284

Abstract

Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval. Starting from an apparent link between text coherence and segmentation, we introduce a novel supervised model for text segmentation with simple but explicit coherence modeling. Our model – a neural architecture consisting of two hierarchically connected Transformer networks – is a multi-task learning model that couples the sentence-level segmentation objective with the coherence objective that differentiates correct sequences of sentences from corrupt ones. The proposed model, dubbed Coherence-Aware Text Segmentation (CATS), yields state-of-the-art segmentation performance on a collection of benchmark datasets. Furthermore, by coupling CATS with cross-lingual word embeddings, we demonstrate its effectiveness in zero-shot language transfer: it can successfully segment texts in languages unseen in training.
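The multi-task coupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the loss functions, so the binary cross-entropy segmentation loss, the margin-based (hinge) coherence loss, the shuffle-based corruption, and the names `corrupt`, `segmentation_loss`, `coherence_loss`, `joint_loss`, `weight`, and `margin` are all assumptions made for illustration. The per-sentence boundary probabilities and sequence coherence scores are taken as given inputs, standing in for the outputs of the two-level Transformer.

```python
import math
import random


def corrupt(sentences, rng):
    """Return a reordered copy of `sentences` (assumed corruption scheme: shuffling)."""
    if len(set(sentences)) < 2:
        raise ValueError("need at least two distinct sentences to corrupt")
    corrupted = list(sentences)
    while corrupted == list(sentences):
        rng.shuffle(corrupted)
    return corrupted


def segmentation_loss(boundary_probs, labels):
    """Mean binary cross-entropy over per-sentence boundary predictions (assumed form)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(boundary_probs, labels):
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(labels)


def coherence_loss(score_correct, score_corrupt, margin=1.0):
    """Hinge loss (assumed form): the correct sequence should outscore the corrupt one by `margin`."""
    return max(0.0, margin - (score_correct - score_corrupt))


def joint_loss(boundary_probs, labels, score_correct, score_corrupt,
               weight=0.5, margin=1.0):
    """Segmentation loss plus a weighted auxiliary coherence loss (hypothetical weighting)."""
    seg = segmentation_loss(boundary_probs, labels)
    coh = coherence_loss(score_correct, score_corrupt, margin)
    return seg + weight * coh


if __name__ == "__main__":
    rng = random.Random(0)
    original = ["Sentence A.", "Sentence B.", "Sentence C."]
    print(corrupt(original, rng))
    print(joint_loss([0.9, 0.1, 0.2], [1, 0, 0], score_correct=0.8, score_corrupt=0.3))
```

In this sketch the coherence term contributes nothing once the correct sequence outscores its corrupted counterpart by at least the margin, so it acts only as an auxiliary signal alongside the main segmentation objective.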

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Apr 3, 2020
Citations: 10

