Accuracy of Categories Pre-Assigned to the Training Document Set and Text Categorization Performance

Kyung Won Shim, Young-Mee Chung

doi:10.3743/kosim.2006.23.2.265

Abstract

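Text categorization assumes that the subject categories assigned to the training document set are accurate to a certain level. This assumption, however, is made without knowledge of real document collections. This study examined how accurate the pre-assigned subject categories in a real document collection actually are, and attempted to verify the relationship between the accuracy of the categories pre-assigned to the training set and text categorization performance. In particular, it sought to determine how far categorization performance can be improved by raising the quality of the subject categories assigned to the training set through manual reindexing. To this end, 1,150 abstract records in science and technology were reindexed by a group of experts; after 15 duplicate documents were removed, the collection was divided into a training set of 907 documents and a test set of 227 documents. The categorization performance of the collections before and after reindexing (the initial collection, Recat-1, and Recat-2) was then compared using a kNN classifier. The average accuracy of category assignment in the initial collection was 16%, and the categorization performance of this collection was 17% in terms of F1. In contrast, the Recat-1 collection, whose subject-category accuracy had been improved, reached an F1 of 61%, about 3.6 times the performance of the initial collection.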
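The comparison described in the abstract (the same kNN classifier trained once on the originally assigned categories and once on corrected ones, then scored with F1 on a common test set) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy documents, the two category names, cosine similarity over raw term counts, `k=3`, and macro-averaged F1 are not taken from the paper, which does not state these details in the abstract.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_predict(train, text, k=3):
    """Label `text` by majority vote among its k most similar training documents."""
    query = Counter(text.split())
    ranked = sorted(train, key=lambda doc: cosine(Counter(doc[0].split()), query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def macro_f1(gold, pred):
    """Unweighted mean of per-category F1 (the averaging scheme is an assumption)."""
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def evaluate(train, test, k=3):
    """Train-free kNN evaluation: predict every test document and score with macro F1."""
    gold = [label for _, label in test]
    pred = [knn_predict(train, text, k) for text, _ in test]
    return macro_f1(gold, pred)

# Hypothetical toy collection with correct ("reindexed") categories.
clean_train = [
    ("quantum particle energy", "physics"),
    ("quantum field energy", "physics"),
    ("quantum particle spin", "physics"),
    ("quantum wave particle", "physics"),
    ("cell gene protein", "biology"),
    ("cell membrane protein", "biology"),
    ("cell gene expression", "biology"),
    ("cell tissue gene", "biology"),
]

# Simulated "initial" collection: the first three documents carry a wrong category.
noisy_train = [
    (text, "biology") if i < 3 else (text, label)
    for i, (text, label) in enumerate(clean_train)
]

test_docs = [
    ("quantum energy spin", "physics"),
    ("particle wave quantum", "physics"),
    ("cell protein membrane", "biology"),
    ("gene expression cell", "biology"),
]

# Share of training documents whose assigned category matches the corrected one.
label_accuracy = sum(n[1] == c[1] for n, c in zip(noisy_train, clean_train)) / len(clean_train)
clean_f1 = evaluate(clean_train, test_docs)
noisy_f1 = evaluate(noisy_train, test_docs)
print(f"label accuracy={label_accuracy:.3f}  noisy F1={noisy_f1:.3f}  clean F1={clean_f1:.3f}")
```

Only the training labels differ between the two runs, so any gap in F1 is attributable to label quality, which is the relationship the study set out to measure. The toy numbers carry no empirical weight and should not be read as a reproduction of the reported 17% and 61% figures.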

Journal: Journal of the Korean Society for Information Management
Publication Date: Jun 1, 2006
License type: cc-by-nc-nd
