MwTExt: automatic extraction of multi-word terms to generate compound concepts within ontology

Pratik Thanawala, Jyoti Pareek

doi:10.1007/s41870-018-0111-6

Abstract

Multiword expressions are an omnipresent element of natural language, and their construal as a linguistic resource is important in many applications. This paper presents MwTExt, an architecture for the automatic extraction of multi-word terms (MWTs) from such expressions in un-annotated English documents. Natural Language Processing techniques such as shallow parsing and syntactic structure analysis are used to extract MWTs, with a specific focus on the lexical patterns (Noun Preposition Noun), (Noun Preposition Noun + Noun) and (Noun Preposition Noun Preposition Noun). The extracted MWTs can then be used to form compound concepts within an ontology. The lexical descriptions of the MWTs are encoded in the Web Ontology Language (OWL/XML). MwTExt was tested on Computer Science domain texts, and its results were compared with those of Text2Onto, an ontology learning tool, and of the term extractors TermRaider and TerMine. The results indicate that MwTExt extracts accurate lexicalized MWTs more effectively, with an average precision of 97%.
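To make the three lexical patterns concrete, the following is a minimal sketch of matching them over tokens that have already been part-of-speech tagged. It is not the MwTExt implementation: the function names (`coarse`, `extract_mwts`), the use of Penn Treebank tags (NN* for nouns, IN for prepositions), the reading of "Noun + Noun" as two adjacent nouns, the longest-match-first strategy, and the sample sentence are all assumptions made for illustration.

```python
# Hypothetical sketch: match noun/preposition patterns over pre-tagged tokens.
# Assumes Penn Treebank tags (NN* = noun, IN = preposition); not the paper's code.

PATTERNS = [
    ("N", "P", "N", "P", "N"),  # Noun Preposition Noun Preposition Noun
    ("N", "P", "N", "N"),       # Noun Preposition Noun + Noun
    ("N", "P", "N"),            # Noun Preposition Noun
]


def coarse(tag):
    """Collapse a Penn Treebank tag to N (noun), P (preposition), or O (other)."""
    if tag.startswith("NN"):
        return "N"
    if tag == "IN":
        return "P"
    return "O"


def extract_mwts(tagged):
    """Scan left to right, keeping the longest pattern that matches at each position."""
    classes = [coarse(tag) for _, tag in tagged]
    terms, i = [], 0
    while i < len(tagged):
        for pattern in PATTERNS:  # ordered longest first
            n = len(pattern)
            if tuple(classes[i:i + n]) == pattern:
                terms.append(" ".join(word for word, _ in tagged[i:i + n]))
                i += n  # skip past the matched term
                break
        else:
            i += 1  # no pattern starts here
    return terms


# Hand-tagged sample sentence (illustrative only).
sentence = [
    ("The", "DT"), ("theory", "NN"), ("of", "IN"), ("computation", "NN"),
    ("covers", "VBZ"), ("the", "DT"), ("complexity", "NN"), ("of", "IN"),
    ("sorting", "NN"), ("algorithms", "NNS"), (".", "."),
]
print(extract_mwts(sentence))
```

In practice the tags would come from a tagger or shallow parser rather than being written by hand, and candidate terms would still need filtering before being added to an ontology as compound concepts.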


Journal: International journal of information technology : an official journal of Bharati Vidyapeeth's Institute of Computer Applications and Management
Publication Date: Feb 21, 2018
Citations: 5

Similar Papers

Validação de termos de domínio por meio de uma base lexical-semântica difusa

Tradterm | VOL. 30
20 Dec 2017

Language Learning Research at the Intersection of Experimental, Computational, and Corpus‐Based Approaches
Patrick Rebuschat ... Detmar Meurers
Language Learning | VOL. 67
01 Jun 2017

Detecting Multiword Expressions and Named Entities in Natural Language Texts
István Nagy
19 Feb 2016

A computer-aided environment for generating multiple-choice test items
Ruslan Mitkov ... Nikiforos Karamanis
Natural language engineering | VOL. 12
22 May 2006
