Detecting Hypernym/Hyponym in Science and Technology Thesaurus Using Entropy-Based Clustering of Word Vectors

Takahiro Kawamura, Motoki Sekine, Kazuaki Matsumura

doi:10.1142/s1793351x17400177

Abstract

Thesauri for science and technology information are increasingly used in bibliometrics and scientometrics. However, the manual construction and maintenance of thesauri are costly and time-consuming; thus, methods for semi-automatic construction and maintenance are being actively studied. We propose a method that expands an existing thesaurus with specified terms extracted from the abstracts of articles. Specifically, we assign the terms to subcategories using our novel clustering method, which is based on the information entropy of word vectors. Then, we determine the hypernyms and hyponyms based on their relations with terms in the subcategories. The word vectors are constructed from 177,000 IEEE articles archived from 2012 to 2014 in the Scopus dataset. In experiments, the terms were correctly classified into the Japan Science and Technology thesaurus with 83.3% precision and 71.4% recall. In the future, we will develop a semi-automatic thesaurus maintenance system that recommends new terms in their proper relative positions.
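The abstract does not spell out the entropy criterion, so the following is only a rough sketch of the general idea of assigning a term to a subcategory and using Shannon entropy to judge how clear-cut that assignment is. Everything in it is an assumption made for illustration: the toy 3-dimensional vectors, the subcategory names, the centroid-plus-softmax scoring, the `temperature` parameter, and the function name `assignment_entropy`. It should not be read as the authors' algorithm.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def assignment_entropy(term_vec, subcategories, temperature=0.1):
    """Score a term against each subcategory centroid (hypothetical scheme).

    Returns (best_subcategory, entropy_in_bits, probabilities).
    Low entropy means one subcategory clearly dominates; high entropy
    means the term sits between subcategories.
    """
    names = sorted(subcategories)
    sims = [cosine(term_vec, centroid(subcategories[name])) for name in names]
    # Softmax over similarities; subtracting the max keeps exp() stable.
    top = max(sims)
    weights = [math.exp((s - top) / temperature) for s in sims]
    total = sum(weights)
    probs = [w / total for w in weights]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    best = names[probs.index(max(probs))]
    return best, entropy, dict(zip(names, probs))


# Toy word vectors for terms already placed in two subcategories (made up).
subcategories = {
    "machine learning": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "semiconductors": [[0.1, 0.9, 0.1], [0.0, 0.8, 0.2]],
}

clear_term = [0.85, 0.15, 0.05]      # close to one subcategory
ambiguous_term = [0.5, 0.5, 0.1]     # roughly between the two

best_clear, h_clear, p_clear = assignment_entropy(clear_term, subcategories)
best_amb, h_amb, p_amb = assignment_entropy(ambiguous_term, subcategories)

print(best_clear, round(h_clear, 3))
print(best_amb, round(h_amb, 3))
```

In a sketch like this, a term with low entropy could be placed automatically, while a high-entropy term might be flagged for manual review, which is in the spirit of the semi-automatic maintenance the abstract describes.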

Journal: International Journal of Semantic Computing
Publication Date: Dec 1, 2017
Citations: 1

Similar Papers

Hyponym/Hypernym Detection in Science and Technology Thesauri from Bibliographic Datasets
Takahiro Kawamura ... Katsuji Matsumura
Control theory & applications | VOL. -
01 Jan 2017

Expanding Science and Technology Thesauri from Bibliographic Datasets Using Word Embedding
Takahiro Kawamura ... Katsuji Matsumura
01 Nov 2016

Introduction to bibliometrics for construction and maintenance of thesauri
Jesper W Schneider ... Pia Borlund
Journal of Documentation | VOL. 60
01 Oct 2004

Bahasa Indonesia pre-trained word vector generation using word2vec for computer and information technology field
Syarifah K Putri ... E B Nababan
Journal of Physics: Conference Series | VOL. 1898
01 Jun 2021
