CLASSIFICATION OF WEB DOCUMENTS USING GRAPH MATCHING

Adam Schenker, Mark Last, Horst Bunke, Abraham Kandel

doi:10.1142/s0218001404003241

Journal: International Journal of Pattern Recognition and Artificial Intelligence
Publication Date: May 1, 2004
Citations: 69

Affiliation: University of South Florida, Ben-Gurion University of the Negev, University of Bern, Tel Aviv University

Abstract

In this paper we describe a classification method that allows the use of graph-based representations of data instead of traditional vector-based representations. We compare the vector approach combined with the k-Nearest Neighbor (k-NN) algorithm to the graph-matching approach when classifying three different web document collections, using the leave-one-out approach for measuring classification accuracy. We also compare the performance of different graph distance measures as well as various document representations that utilize graphs. The results show the graph-based approach can outperform traditional vector-based methods in terms of accuracy, dimensionality and execution time.
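The pipeline summarized in the abstract (graph representations, a graph distance, k-NN, and leave-one-out evaluation) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the hypothetical `make_graph` helper builds a graph whose nodes are unique terms and whose edges link adjacent words, graph size is counted as nodes plus edges, and `graph_distance` uses a common-subgraph ratio as one possible distance measure. The paper compares several representations and distance measures that this sketch does not reproduce.

```python
from collections import Counter


def make_graph(text):
    """Hypothetical representation: unique terms as nodes, adjacent word pairs as directed edges."""
    words = text.lower().split()
    return frozenset(words), frozenset(zip(words, words[1:]))


def graph_distance(g1, g2):
    """Assumed measure: 1 - |common subgraph| / max(|g1|, |g2|), with size = nodes + edges.

    Because node labels are unique, the common subgraph is simply the
    intersection of the node sets and of the edge sets.
    """
    (nodes1, edges1), (nodes2, edges2) = g1, g2
    common = len(nodes1 & nodes2) + len(edges1 & edges2)
    largest = max(len(nodes1) + len(edges1), len(nodes2) + len(edges2))
    return 0.0 if largest == 0 else 1.0 - common / largest


def knn_classify(query, training, k=3):
    """Majority vote among the k training graphs closest to the query."""
    ranked = sorted(training, key=lambda item: graph_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(dataset, k=3):
    """Classify each document using all the others as training data."""
    correct = 0
    for i, (graph, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        if knn_classify(graph, rest, k) == label:
            correct += 1
    return correct / len(dataset)


# Toy data, invented for illustration only.
dataset = [
    (make_graph("stock market prices rise"), "business"),
    (make_graph("stock market prices fall"), "business"),
    (make_graph("football team wins match"), "sports"),
    (make_graph("football team loses match"), "sports"),
]
accuracy = leave_one_out_accuracy(dataset, k=1)
```

With unique node labels the common subgraph reduces to set intersections, so the distance is cheap to compute in this sketch; a representation with repeated labels would need a general subgraph-matching routine instead.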
