ON THE EFFECTIVENESS OF PARAGRAPH VECTOR MODELS IN DOCUMENT SIMILARITY ESTIMATION FOR TURKISH NEWS CATEGORIZATION

Ali Yürekli

doi:10.18038/estubtda.1175001

Abstract

News categorization, a common application of text classification, is the task of automatically annotating news articles with predefined categories. In parallel with the rise of deep learning techniques in machine learning, neural embedding models have been widely used to capture hidden relationships and similarities among textual representations of news articles. In this study, we approach the Turkish news categorization problem as an ad-hoc retrieval task and investigate the effectiveness of paragraph vector models for computing and exploiting document-wise similarities of Turkish news articles. We propose an ensemble categorization approach that consists of three main stages: document processing, paragraph vector learning, and document similarity estimation. Extensive experiments conducted on the TTC-3600 dataset reveal that the proposed system can reach up to 93.5% classification accuracy, which compares favorably with the baseline and state-of-the-art methods. Moreover, the Distributed Bag of Words version of Paragraph Vectors performs better than the Distributed Memory Model of Paragraph Vectors in terms of both accuracy and computational performance.
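
The abstract does not spell out how document similarities are turned into category labels. As a minimal sketch of the final stage only, assuming paragraph vectors have already been learned and assuming a k-nearest-neighbour majority vote over cosine similarity (an assumption, not a detail confirmed by the abstract), retrieval-style categorization might look like the following. The function names, the toy two-dimensional vectors, the category labels, and the value of k are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def categorize(query_vec, labeled_vecs, k=3):
    """Assign the majority category among the k labeled documents most similar to the query.

    labeled_vecs is a list of (vector, category) pairs. The voting rule is an
    illustrative assumption, not the paper's confirmed method.
    """
    # Rank labeled documents by similarity to the query, most similar first.
    ranked = sorted(labeled_vecs, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    # Count categories among the top k; ties go to the category seen first in the ranking.
    votes = Counter(category for _, category in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy "paragraph vectors"; real ones would come from a trained model.
train = [
    ([0.9, 0.1], "sports"),
    ([0.8, 0.2], "sports"),
    ([0.1, 0.9], "economy"),
    ([0.2, 0.8], "economy"),
]

predicted = categorize([0.85, 0.15], train, k=3)
```

Framing categorization as retrieval in this way means no separate classifier has to be trained: a new article is labeled purely by its nearest labeled neighbours in the embedding space.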

Journal: Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering
Publication Date: Mar 29, 2023
Citations: 1
