Abstract

This paper examines the use of linguistic techniques in automatic term recognition. It describes the TRUCKS model, which uses several types of contextual information (syntactic, semantic, terminological and statistical) and seeks in particular to identify those parts of the context that are most relevant to terms. Starting from an initial corpus of sublanguage texts, the system identifies, disambiguates and ranks candidate terms. The system is evaluated against the statistical approach on which it is built and against its expected theoretical performance. We show that using deeper forms of contextual information improves the extraction of multi-word terms. The resulting ranked list of terms improves on that produced by traditional methods in both precision and distribution, and the information acquired in the process can also be used for other applications, such as disambiguation, lexical tuning and term clustering.

