Semi-Automatic Indonesian WordNet Establishment: From Synset Extraction to Visual Editor

Gunawan, I Ketut Eddy Purnama, Mochamad Hariadi

doi:10.14257/ijmue.2016.11.8.02

Abstract

In this study, we develop an Indonesian WordNet in four main phases: extraction of synonym sets (synsets), the smallest entities of a lexical database for a natural language; establishment of semantic relations between synsets (hypernym-hyponym and holonym-meronym); gloss extraction for the synset collection; and creation of a visual editor. The term "semi-automatic" reflects that the first three phases are carried out automatically using a number of machine learning approaches, while the visual editor is used to collaboratively complement the results collected in those phases. The raw data used to acquire synsets, semantic relations, and glosses come from Kamus Besar Bahasa Indonesia (Great Dictionary of the Indonesian Language, abbreviated as KBBI), Tesaurus Bahasa Indonesia (Indonesian Language Thesaurus), a large collection of web pages from search engines, Wikipedia, and Princeton WordNet, the last for mapping purposes. The proposed system yields 37,485 synsets, 24,256 hypernym-hyponym relations, 11,044 holonym-meronym relations, and 6,520 synsets with glosses. A similar approach is believed to accelerate the development of WordNet-like lexical databases for other languages.
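To make the entities named in the abstract concrete, the following is a minimal sketch of how synsets, the two relation types, and glosses might be held in memory. It is not the authors' implementation: the class names `Synset` and `LexicalDB`, the identifier scheme (`anjing.n.01`), and the example words are all assumptions chosen for illustration, and the gloss text is a placeholder rather than a KBBI entry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Synset:
    """A set of synonymous lemmas, optionally carrying a gloss (definition)."""
    synset_id: str              # hypothetical identifier scheme
    lemmas: Set[str]
    gloss: Optional[str] = None


@dataclass
class LexicalDB:
    """Hypothetical container for synsets and their semantic relations."""
    synsets: Dict[str, Synset] = field(default_factory=dict)
    hypernyms: Dict[str, Set[str]] = field(default_factory=dict)  # hyponym -> hypernyms
    holonyms: Dict[str, Set[str]] = field(default_factory=dict)   # meronym -> holonyms

    def add_synset(self, synset: Synset) -> None:
        self.synsets[synset.synset_id] = synset

    def _link(self, table: Dict[str, Set[str]], source: str, target: str) -> None:
        # Relations are only allowed between synsets that already exist.
        if source not in self.synsets or target not in self.synsets:
            raise KeyError("both synsets must exist before they are linked")
        table.setdefault(source, set()).add(target)

    def add_hypernym(self, hyponym_id: str, hypernym_id: str) -> None:
        self._link(self.hypernyms, hyponym_id, hypernym_id)

    def add_holonym(self, meronym_id: str, holonym_id: str) -> None:
        self._link(self.holonyms, meronym_id, holonym_id)

    def hyponyms_of(self, hypernym_id: str) -> List[str]:
        # Invert the hyponym -> hypernym table for a single hypernym.
        return sorted(h for h, ups in self.hypernyms.items() if hypernym_id in ups)

    def counts(self) -> Dict[str, int]:
        # The same four totals the abstract reports, computed for this toy data.
        return {
            "synsets": len(self.synsets),
            "hypernym_hyponym": sum(len(v) for v in self.hypernyms.values()),
            "holonym_meronym": sum(len(v) for v in self.holonyms.values()),
            "synsets_with_gloss": sum(1 for s in self.synsets.values() if s.gloss),
        }


# Illustrative data only; none of these entries are taken from the paper.
db = LexicalDB()
db.add_synset(Synset("binatang.n.01", {"binatang", "hewan"},
                     gloss="(placeholder gloss, not taken from KBBI)"))
db.add_synset(Synset("anjing.n.01", {"anjing"}))
db.add_synset(Synset("ekor.n.01", {"ekor", "buntut"}))
db.add_hypernym("anjing.n.01", "binatang.n.01")  # a dog is a kind of animal
db.add_holonym("ekor.n.01", "anjing.n.01")       # a tail is part of a dog

print(db.counts())
```

Storing each relation as a mapping from the more specific synset to the more general one keeps insertion cheap; a visual editor of the kind described would presumably also need the inverse lookups, which `hyponyms_of` illustrates for one relation.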

Journal: International Journal of Multimedia and Ubiquitous Engineering
Publication Date: Aug 31, 2016
Citations: 14
