Utilizing Local Outlier Factor for Open-Set Classification in High-Dimensional Data - Case Study Applied for Text Documents

Tomasz Walkowiak, Henryk Maciejewski, Szymon Datko

doi:10.1007/978-3-030-29516-5_33

Publication Date: Aug 24, 2019

Citations: 1

Affiliation: Wrocław University of Science and Technology, AGH University of Krakow

Abstract

This paper examines the use of the Local Outlier Factor (LOF) algorithm for open-set classification of high-dimensional data. For the application to text documents, we investigate the fastText method for extracting feature vectors. We then build a classifier and evaluate its precision and recall on test data that contains both subject categories known during training and completely new categories. Next, we attempt to identify the incorrect outcomes in which documents from new categories are assigned to one of the trained classes; for this we use the LOF algorithm. We show how the decision function threshold in LOF influences the precision and recall of this open-set classification procedure.
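The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `LocalOutlierFactor` in novelty mode, uses synthetic Gaussian vectors as stand-ins for fastText document features, picks logistic regression as an arbitrary closed-set classifier, and fits one LOF model per known class. The helper names `open_set_predict` and `rejection_precision_recall` and the label `-1` for "unknown category" are hypothetical choices made for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
DIM = 16  # assumed feature dimensionality; real document vectors are larger

# Synthetic stand-ins for document feature vectors (not real fastText output).
X_train = np.vstack([rng.normal(0.0, 1.0, (200, DIM)),
                     rng.normal(5.0, 1.0, (200, DIM))])
y_train = np.repeat([0, 1], 200)

# Test set: 50 documents per known class plus 50 from an unseen category.
X_test = np.vstack([rng.normal(0.0, 1.0, (50, DIM)),
                    rng.normal(5.0, 1.0, (50, DIM)),
                    rng.normal(-20.0, 1.0, (50, DIM))])
is_known = np.repeat([True, True, False], 50)

# Closed-set classifier trained on the known categories only (assumed choice).
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# One LOF model per known class; novelty=True enables scoring of unseen data.
lof_per_class = {
    c: LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_train[y_train == c])
    for c in np.unique(y_train)
}


def open_set_predict(X, threshold=0.0):
    """Predict a known class, or -1 when the LOF score falls below threshold."""
    pred = clf.predict(X)
    scores = np.empty(len(X))
    for c, lof in lof_per_class.items():
        mask = pred == c
        if mask.any():
            # Higher decision_function values mean "more like the training data".
            scores[mask] = lof.decision_function(X[mask])
    return np.where(scores >= threshold, pred, -1)


def rejection_precision_recall(pred, is_known):
    """Precision and recall of flagging documents from unknown categories."""
    rejected = pred == -1
    true_pos = np.sum(rejected & ~is_known)
    precision = true_pos / max(rejected.sum(), 1)
    recall = true_pos / max((~is_known).sum(), 1)
    return precision, recall


for t in (-1.0, 0.0, 0.3):
    p, r = rejection_precision_recall(open_set_predict(X_test, t), is_known)
    print(f"threshold={t:+.1f}  precision={p:.2f}  recall={r:.2f}")
```

Raising the threshold rejects more documents, which can only keep or increase recall on the unknown category while risking lower precision, since documents from known categories start to be rejected as well. Sweeping the threshold in this way is one simple means of examining that trade-off.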

Full Text