A Method for Refining a Taxonomy by Using Annotated Suffix Trees and Wikipedia Resources

Ekaterina Chernyak, Boris Mirkin

doi:10.1016/j.procs.2014.05.260

Open Access

Abstract

A two-step approach to taxonomy construction is presented. In the first step, the frame of the taxonomy is built manually from representative educational materials. In the second step, the frame is refined using the Wikipedia category tree and articles. Since the structure of Wikipedia is rather noisy, a procedure for cleaning the Wikipedia category tree is suggested. A string-to-text relevance score, based on annotated suffix trees, is used several times: 1) to clear the Wikipedia data of noise; 2) to assign Wikipedia categories to taxonomy topics; 3) to decide whether a category should be assigned to a taxonomy topic or stay on intermediate levels. The resulting taxonomy consists of three parts: the manually set upper levels, the adopted Wikipedia category tree, and the Wikipedia articles as leaves. In addition, a set of so-called descriptors is assigned to every leaf; these are phrases explaining aspects of the leaf topic. The method is illustrated by its application to two domains: a) “Probability theory and mathematical statistics” and b) “Numerical analysis” (both in Russian).
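
The abstract does not state how the string-to-text relevance score is computed, so the following is only a minimal sketch of one common formulation of annotated-suffix-tree scoring, not the authors' implementation. It assumes an uncompressed character trie in which every node stores how many suffix occurrences pass through it, and it assumes the score of a string is the mean, over its suffixes, of the average conditional probability count(child)/count(parent) along the longest matching path. The names `Node`, `build_ast`, `suffix_score`, and `relevance` are hypothetical.

```python
class Node:
    """Trie node annotated with the number of suffix occurrences passing through it."""

    def __init__(self):
        self.children = {}
        self.count = 0


def build_ast(fragments):
    """Build an uncompressed annotated suffix trie over short text fragments.

    Assumption: the text has already been split into short fragments
    (for example, a few words each) so that the trie stays small.
    """
    root = Node()
    for frag in fragments:
        for start in range(len(frag)):
            root.count += 1  # one more suffix inserted overall
            node = root
            for ch in frag[start:]:
                node = node.children.setdefault(ch, Node())
                node.count += 1
    return root


def suffix_score(root, s):
    """Average of count(child) / count(parent) along the longest prefix of s found in the trie."""
    node, total, matched = root, 0.0, 0
    for ch in s:
        child = node.children.get(ch)
        if child is None:
            break
        total += child.count / node.count
        matched += 1
        node = child
    return total / matched if matched else 0.0


def relevance(root, string):
    """Mean suffix score over all suffixes of the string (assumed scoring rule)."""
    if not string:
        return 0.0
    scores = [suffix_score(root, string[i:]) for i in range(len(string))]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Illustrative data only: fragments of a made-up article text.
    ast = build_ast(["random variable", "probability distribution"])
    for topic in ["probability", "numerical analysis"]:
        print(topic, round(relevance(ast, topic), 3))
```

Under these assumptions every score lies between 0 and 1, so a single threshold could serve each of the three uses listed in the abstract; any such threshold value would be a tuning choice that the abstract does not specify.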

Journal: Procedia Computer Science	Publication Date: Jan 1, 2014
Citations: 1	License type: cc-by-nc-nd
