Automatic term recognition based on statistics of compound nouns

Hiroshi Nakagawa

doi:10.1075/term.6.2.05nak

Abstract

The NTCIR1 TMREC group called for participation in the term recognition task, which was part of NTCIR1, held in 1999. As one of its activities, TMREC provided a test collection for this task. The goal of the task is to automatically recognize and extract terms from a text corpus consisting of 1,870 abstracts gathered from the NACSIS Academic Conference Database. This article describes the method we proposed for extracting terms consisting of simple and compound nouns, and its experimental evaluation on the NTCIR TMREC test collection. The basic idea for scoring a simple noun N is to count how many nouns are conjoined with N to form compound nouns. We then extend this score to compound nouns, because most technical terms are compound nouns. The method has a parameter that tunes the degree of preference for either longer or shorter compound nouns. As term candidates, in addition to noun sequences, we may add variations such as the pattern "A no B", which roughly means "B of A" or "A's B", and/or "A na B", where "A na" is an adjective. The experimental results are promising: recall of 0.83, precision of 0.46 and F-value of 0.59 for exactly matched extracted terms when the 16,000 top-scoring extracted terms are taken into account.
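The scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the article's actual formula, which the abstract does not give. The sketch assumes that "how many nouns are conjoined with N" means the number of distinct nouns seen directly to the left and to the right of N inside compound nouns, and that a compound noun's score combines its components' counts in a geometric-mean style. The names `neighbor_counts` and `score`, the parameter `alpha`, the way it enters the exponent, and the toy English compounds are all hypothetical. The helper `f_measure` assumes the reported F-value is the balanced harmonic mean of precision and recall.

```python
from collections import defaultdict
from math import prod


def neighbor_counts(compounds):
    """Collect, for each simple noun, the distinct nouns adjacent to it in compounds."""
    left = defaultdict(set)   # left[n]: nouns seen immediately before n
    right = defaultdict(set)  # right[n]: nouns seen immediately after n
    for compound in compounds:
        for a, b in zip(compound, compound[1:]):
            right[a].add(b)
            left[b].add(a)
    return left, right


def score(compound, left, right, alpha=1.0):
    """Score a compound noun from its component nouns (illustrative assumption).

    alpha is a hypothetical length-preference knob: alpha = 1 gives a plain
    geometric mean, alpha < 1 tends to favor longer compounds, and alpha > 1
    tends to favor shorter ones.
    """
    k = len(compound)
    product = prod(
        (len(left.get(n, ())) + 1) * (len(right.get(n, ())) + 1)
        for n in compound
    )
    return product ** (1.0 / (2 * k ** alpha))


def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean), assumed to match the reported F-value."""
    return 2 * precision * recall / (precision + recall)


# Toy compound nouns, given as tuples of simple nouns (hypothetical data).
compounds = [
    ("information", "retrieval"),
    ("information", "extraction"),
    ("term", "extraction"),
    ("information", "retrieval", "system"),
    ("term", "recognition"),
]
left, right = neighbor_counts(compounds)

short = ("information", "retrieval")
long_ = ("information", "retrieval", "system")
print(score(short, left, right), score(long_, left, right))
print(score(short, left, right, alpha=0.0), score(long_, left, right, alpha=0.0))
print(round(f_measure(0.46, 0.83), 2))
```

On this toy data, the shorter compound ranks higher with `alpha=1.0` and the longer one ranks higher with `alpha=0.0`, which mirrors the tunable length preference the abstract mentions. Under the harmonic-mean assumption, the reported precision and recall reproduce the reported F-value of 0.59.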


Journal: Terminology / International Journal of Theoretical and Applied Issues in Specialized Communication
Publication Date: Dec 31, 2000
Citations: 73

Similar Papers

Automatic term recognition based on statistics of compound nouns and their components
Tatsunori Mori ... Hiroshi Nakagawa
Terminology / International Journal of Theoretical and Applied Issues in Specialized Communication | VOL. 9
31 Dec 2004

Improvement of Terminology Extraction Method for Specific Patent Search
Kazuhiko Tsuda ... Koji Tanaka
Procedia Computer Science | VOL. 35
01 Jan 2014

TermExtract: Accuracy of Compound Noun Detection in Japanese
Vitaly Klyuev ... Motoki Miyashita
01 Jan 2014

Ode à l'odeur (les noms prédicatifs de « sensations olfactives » en russe) [Ode to smell (predicative nouns for "olfactory sensations" in Russian)]
Irina Thomières-Kokochkina
Revue Russe | VOL. 40
01 Jan 2013
