A Large-Scale Chinese Multimodal NER Dataset with Speech Clues

Dianbo Sui

doi:10.48448/7hqr-qb88

Abstract

In this paper, we explore an uncharted territory: Chinese multimodal named entity recognition (NER) with both textual and acoustic content. To this end, we construct a large-scale, human-annotated Chinese multimodal NER dataset named CNERTA. The corpus contains 42,987 annotated sentences accompanied by 71 hours of speech data. Based on this dataset, we propose a family of strong and representative baseline models that leverage either textual features or multimodal features. Building on these baselines, we further propose a simple multimodal multitask model that captures the natural monotonic alignment between the textual and acoustic modalities by introducing a speech-to-text alignment auxiliary task. Through extensive experiments, we observe that: (1) performance improves progressively as we move from unimodal to multimodal models, verifying the necessity of integrating speech clues into Chinese NER; and (2) our proposed model yields state-of-the-art (SoTA) results on CNERTA, demonstrating its effectiveness. For further research, the annotated dataset is publicly available at http://github.com/DianboWork/CNERTA.
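
The abstract does not say how the speech-to-text alignment auxiliary task is implemented. As a minimal sketch, the block below assumes a CTC-style objective, which is one common way to score monotonic alignments between acoustic frames and text labels. The names `ctc_likelihood`, `joint_loss`, `BLANK` and `align_weight` are hypothetical and not taken from the paper or its repository.

```python
import math
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_likelihood(frame_probs: Sequence[Sequence[float]],
                   target: Sequence[int]) -> float:
    """Sum the probability of every monotonic frame-to-label alignment of target."""
    # Interleave blanks: [y1, y2] -> [blank, y1, blank, y2, blank]
    extended: List[int] = [BLANK]
    for label in target:
        extended += [label, BLANK]
    num_states = len(extended)

    # alpha[s]: probability of all partial alignments that end in state s
    alpha = [0.0] * num_states
    alpha[0] = frame_probs[0][BLANK]
    if num_states > 1:
        alpha[1] = frame_probs[0][extended[1]]

    for probs in frame_probs[1:]:
        prev = alpha
        alpha = [0.0] * num_states
        for s in range(num_states):
            total = prev[s]              # stay in the same state
            if s >= 1:
                total += prev[s - 1]     # advance by one state
            # Skip a blank only when it separates two different labels
            if s >= 2 and extended[s] != BLANK and extended[s] != extended[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[extended[s]]

    # A valid alignment ends on the last label or on the trailing blank
    likelihood = alpha[-1]
    if num_states > 1:
        likelihood += alpha[-2]
    return likelihood


def joint_loss(ner_loss: float,
               frame_probs: Sequence[Sequence[float]],
               target: Sequence[int],
               align_weight: float = 0.5) -> float:
    """Multitask objective: NER loss plus a weighted alignment loss (assumed form)."""
    align_loss = -math.log(ctc_likelihood(frame_probs, target))
    return ner_loss + align_weight * align_loss


# Two frames, vocabulary {blank, 1}, uniform probabilities, target [1]
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(ctc_likelihood(uniform, [1]))  # three valid alignments, each with probability 0.25
```

In this toy case the three alignments are "1 1", "1 blank" and "blank 1". In a real model the per-frame probabilities would come from an acoustic encoder, and the weighting between the two losses would need tuning.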

Similar Papers

FE-CFNER: Feature Enhancement-based approach for Chinese Few-shot Named Entity Recognition
Sanhe Yang ... Yilei Wang
Computer Speech & Language | VOL. 90
09 Oct 2024

Chinese Medical Named Entity Recognition Using External Knowledge
Lin Zhang ... Ruiqing Wang
01 Jan 2021

HiNER: Hierarchical feature fusion for Chinese named entity recognition
Shuxiang Hou ... Mengnan Ma
Neurocomputing | VOL. 611
05 Oct 2024

Named entity recognition of local adverse drug reactions in Xinjiang based on transfer learning
Keming Kang ... Shengwei Tian
Journal of Intelligent & Fuzzy Systems | VOL. 40
01 Jan 2020
