WordNet-based hybrid VSM for Document Classification

Luda Wang, Shouping Gao, Peng Zhang

doi:10.14257/ijdta.2016.9.1.16

Abstract

Many text classification methods rely on statistical term measures or synsets for document representation. Such representations ignore the lexical semantic content of terms and the relations between them, and so lose the distilled mutual information. This work proposes a synthetic document representation method, the WordNet-based hybrid VSM, to solve this problem. The method constructs a data structure of semantic-element information to characterize lexical semantic content and to support disambiguation of word stems. As a template, a lexical semantic vector consisting of lexical semantic contents is built in the lexical semantic space of the corpus, and lexical semantic relations are marked on the vector. This vector is then joined with a special term vector to form the eigenvector in the hybrid VSM. Experiments applying the NWKNN algorithm to the Reuters-21578 text corpus and an adjusted version of it show that the eigenvector achieves a better F1 measure than document representations based on TF-IDF.
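
The idea of joining a semantic vector with a term vector can be illustrated with a minimal Python sketch. It is not the authors' method: the toy lexicon `TOY_CONCEPTS` is a hypothetical stand-in for WordNet lookups, simple concept counts replace the paper's semantic-element structure and relation marking, and the function names are invented for illustration. NWKNN classification is not shown.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for WordNet lookups (assumption:
# each word maps to one concept label; no disambiguation is attempted).
TOY_CONCEPTS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "bank": "finance",
    "loan": "finance",
}

def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for non-empty tokenized documents."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[t] / len(d) * math.log(n / df[t]) for t in vocab])
    return vocab, vectors

def concept_vectors(docs, lexicon):
    """Return (concepts, normalized concept-count vectors) per document."""
    concepts = sorted(set(lexicon.values()))
    vectors = []
    for d in docs:
        counts = Counter(lexicon[t] for t in d if t in lexicon)
        vectors.append([counts[c] / len(d) for c in concepts])
    return concepts, vectors

def hybrid_vectors(docs, lexicon):
    """Concatenate each document's concept vector with its TF-IDF vector."""
    _, term_vecs = tfidf_vectors(docs)
    _, sem_vecs = concept_vectors(docs, lexicon)
    return [s + t for s, t in zip(sem_vecs, term_vecs)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

docs = [
    ["car", "engine", "repair"],
    ["automobile", "motor", "service"],
    ["bank", "loan", "rate"],
]
```

In this toy corpus the first two documents share no terms, so their TF-IDF vectors are orthogonal, while their hybrid vectors overlap on the shared `vehicle` concept. That is the kind of information a purely term-based representation cannot capture.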

Journal: International Journal of Database Theory and Application
Publication Date: Jan 31, 2016
Citations: 14

Similar Papers

Translating lexical semantic relations
Chu-Ren Huang ... Dylan B S Tsai
01 Jan 2002

Acquisition of Basic Lexical Semantic Conceptual Relations Using Specialized Dictionaries
Velislava Stoykova
Procedia - Social and Behavioral Sciences | VOL. 93
01 Oct 2013

Lexical-semantic SLVM for XML Document Classification
Jun Long ... Zude Li
Journal of Software | VOL. -
12 Jan 2014

Bridging the Defined and the Defining: Exploiting Implicit Lexical Semantic Relations in Definition Modeling
Koki Washio ... Tsuneaki Kato
01 Jan 2019
