A Pattern and POS Auto-Learning Method for Terminology Extraction from Scientific Text

Wei Shao, Bolin Hua, Linqi Song

doi:10.2478/dim-2021-0005

Open Access

https://doi.org/10.2478/dim-2021-0005

Abstract

New scientific documents are published on various platforms every day, making it increasingly important to discover new terms and their meanings in these documents quickly and efficiently. However, most related work relies on labeled data and therefore handles unlabeled new documents poorly. To address this, we introduce an unsupervised method based on sentence patterns and part-of-speech (POS) sequences. The method needs only a few initial learnable patterns to obtain the initial terminology tokens and their POS sequences. In this process, new patterns are constructed; these match more sentences and so yield more POS sequences of terminology. Finally, we use the obtained POS sequences and sentence patterns to extract terms from new scientific text. Experiments on paper abstracts from Web of Knowledge show that the method is practical and achieves good performance on our test data.
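The bootstrapping loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: sentences are assumed to be already POS-tagged, a "pattern" is simplified to the pair of words immediately before and after a term, and the names `spans_for_pattern`, `spans_for_pos`, `learn`, `extract`, and `MAX_TERM_LEN` are all hypothetical.

```python
from typing import Iterator, List, Set, Tuple

Token = Tuple[str, str]        # (word, POS tag), Penn Treebank style tags
Sentence = List[Token]
Pattern = Tuple[str, str]      # assumption: (word before term, word after term)
PosSeq = Tuple[str, ...]

MAX_TERM_LEN = 4  # assumed upper bound on term length, in tokens


def tags_of(sent: Sentence, start: int, end: int) -> PosSeq:
    return tuple(tag for _, tag in sent[start:end])


def spans_for_pattern(sent: Sentence, pattern: Pattern) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) token spans bracketed by the pattern's two words."""
    left, right = pattern
    words = [w.lower() for w, _ in sent]
    for i, word in enumerate(words):
        if word != left:
            continue
        # Look for the closing word after 1..MAX_TERM_LEN term tokens.
        for j in range(i + 2, min(i + 2 + MAX_TERM_LEN, len(words))):
            if words[j] == right:
                yield i + 1, j
                break


def spans_for_pos(sent: Sentence, seq: PosSeq) -> Iterator[Tuple[int, int]]:
    """Yield spans whose tags equal seq and that have a word on both sides."""
    n = len(seq)
    for i in range(1, len(sent) - n):
        if tags_of(sent, i, i + n) == seq:
            yield i, i + n


def learn(corpus: List[Sentence], seeds: Set[Pattern], rounds: int = 2):
    """Alternate between learning POS sequences and learning new patterns."""
    patterns, pos_seqs = set(seeds), set()
    for _ in range(rounds):
        # Step 1: known patterns reveal term spans, whose POS sequences we keep.
        for sent in corpus:
            for pattern in list(patterns):
                for s, e in spans_for_pattern(sent, pattern):
                    pos_seqs.add(tags_of(sent, s, e))
        # Step 2: known POS sequences reveal spans, whose context becomes a pattern.
        for sent in corpus:
            for seq in list(pos_seqs):
                for s, e in spans_for_pos(sent, seq):
                    patterns.add((sent[s - 1][0].lower(), sent[e][0].lower()))
    return patterns, pos_seqs


def extract(sent: Sentence, patterns: Set[Pattern], pos_seqs: Set[PosSeq]) -> List[str]:
    """Return spans that match both a learned pattern and a learned POS sequence."""
    found = set()
    for pattern in patterns:
        for s, e in spans_for_pattern(sent, pattern):
            if tags_of(sent, s, e) in pos_seqs:
                found.add(" ".join(w for w, _ in sent[s:e]))
    return sorted(found)


# Toy, hand-tagged data (illustrative only, not from the paper).
corpus = [
    [("We", "PRP"), ("propose", "VBP"), ("a", "DT"), ("method", "NN"),
     ("called", "VBN"), ("conditional", "JJ"), ("random", "JJ"), ("field", "NN"),
     (",", ","), ("which", "WDT"), ("works", "VBZ"), (".", ".")],
    [("We", "PRP"), ("use", "VBP"), ("the", "DT"), ("conditional", "JJ"),
     ("random", "JJ"), ("field", "NN"), ("for", "IN"), ("tagging", "NN"), (".", ".")],
]
seed_patterns = {("called", ",")}
patterns, pos_seqs = learn(corpus, seed_patterns)

new_sentence = [("We", "PRP"), ("evaluate", "VBP"), ("the", "DT"), ("latent", "JJ"),
                ("semantic", "JJ"), ("analysis", "NN"), ("for", "IN"),
                ("retrieval", "NN"), (".", ".")]
terms = extract(new_sentence, patterns, pos_seqs)
print(terms)
```

In this toy run, the seed pattern finds "conditional random field" and records its tag sequence; that sequence then matches the second sentence and contributes the context pair ("the", "for"), which in turn picks up a term in the unseen sentence. Requiring both a pattern match and a POS match at extraction time is a design choice made here to limit noise; the paper may combine the two signals differently.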

Journal: Data and Information Management	Publication Date: Jul 1, 2021
Citations: 4	License type: CC BY-NC-ND 3.0

Similar Papers

Some approaches to translation of professional terms abbreviation in materials science
Natalya Sigacheva ... Diliara Gainanova
E3S Web of Conferences | VOL. 274
01 Jan 2020

InaNLP: Indonesia natural language processing toolkit, case study: Complaint tweet classification
Ayu Purwarianti ... Irfan Afif
01 Aug 2016

Terminological Information Extraction from Russian Scientific Texts: Methods and Applications
Elena Bolshakova ... Kirill Ivanov
18 Mar 2019

A Comparative Study of Arabic Part of Speech Taggers Using Literary Text Samples from Saudi Novels
Reyadh Alluhaibi ... Mohammad A R Abdeen
Information | VOL. 12
15 Dec 2021
