Classification Performance of Bio-Marker and Disease Word using Word Representation Models

Young-Shin Youn, Yu-Seop Kim, Hye-Jeong Song, Kyung-Min Nam, Jong-Dae Kim, Chan-Young Park

doi:10.14257/ijbsbt.2016.8.1.26

Abstract

One of the most important processes in a machine learning-based natural language processing module is representing words as input to the module. This can be done by representing words in one-hot form, which requires a large vector size and does not capture semantic similarity between words, or by word representation (word embedding), which uses vectors that reflect lexical similarity. Word representation has attracted keen research interest because it has improved the performance of several natural language processing models, such as syntactic parsing and sentiment analysis (also known as opinion mining). In this study, the classification performance of Word2Vec, canonical correlation analysis (CCA), and GloVe is tested on a corpus built from the titles and abstracts of 204,674 biomedical articles published in PubMed. The categories are disease name, disease symptom, and ovarian cancer marker; ovarian cancer markers were used as the biomarkers. The classification performance of each word representation model for each category is visualized by mapping the word representations into two dimensions using t-distributed stochastic neighbor embedding (t-SNE).
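The contrast between one-hot vectors and word embeddings can be illustrated with a minimal sketch. The vocabulary, the two-dimensional vectors, and the helper names `one_hot` and `cosine` below are hypothetical and chosen only for illustration; they are not taken from the study's corpus or from any trained Word2Vec, CCA, or GloVe model.

```python
import math


def one_hot(word, vocab):
    """Return a one-hot vector for `word` over the ordered list `vocab`."""
    return [1.0 if w == word else 0.0 for w in vocab]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical three-word vocabulary: two marker names and one symptom.
vocab = ["CA125", "HE4", "nausea"]

# Made-up dense vectors standing in for learned embeddings (assumption).
toy_embedding = {
    "CA125": [0.9, 0.1],
    "HE4": [0.8, 0.2],
    "nausea": [0.1, 0.9],
}

# One-hot vectors of distinct words are orthogonal, so similarity is zero.
sim_one_hot = cosine(one_hot("CA125", vocab), one_hot("HE4", vocab))

# Dense vectors can place related words close together.
sim_markers = cosine(toy_embedding["CA125"], toy_embedding["HE4"])
sim_marker_symptom = cosine(toy_embedding["CA125"], toy_embedding["nausea"])

print(sim_one_hot, sim_markers, sim_marker_symptom)
```

With one-hot vectors every pair of distinct words is equally dissimilar, whereas the toy dense vectors give the two marker words a higher similarity than the marker and symptom pair. That kind of separation is what a category-wise comparison of embedding models would look for.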

More From: International Journal of Bio-Science and Bio-Technology
