Abstract

Representing patterns as labeled graphs is becoming increasingly common in the broad field of computational intelligence. Accordingly, a wide repertoire of pattern recognition tools, such as classifiers and knowledge discovery procedures, is nowadays available and tested on various datasets of labeled graphs. However, designing effective learning procedures that operate in the space of labeled graphs is still a challenging problem, especially from the computational complexity viewpoint. In this paper, we present a major improvement of a general-purpose graph classifier based on an interplay of dissimilarity representation, clustering, information-theoretic techniques, and evolutionary optimization algorithms. The improvement focuses on a key subroutine devised to compress the input data. We prove several theorems that are fundamental to setting the parameters controlling this compression operation. We demonstrate the effectiveness of the resulting classifier by benchmarking the developed variants on well-known datasets of labeled graphs, using classification accuracy, computing time, and parsimony (in terms of the structural complexity of the synthesized classification models) as distinct performance indicators. The results show state-of-the-art test set accuracy and a considerable speed-up in computing time.

Highlights

  • Graphs offer powerful models for representing patterns characterized by interacting elements, in both static and dynamic scenarios

  • The first of these systems performs a randomized selection of the training graphs to develop the dissimilarity representation of the input data. This system adopts the same three-weight edit scheme (TWEC) used in Optimized Dissimilarity Space Embedding (ODSE) and performs the classification in the dissimilarity space (DS) by means of a k-nearest neighbors (k-NN) classifier equipped with the Euclidean distance (see the sketch after this list)

  • We have presented different variants of the improved ODSE graph classification system

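As a concrete illustration of that first system, the sketch below builds a dissimilarity-space representation from a randomly selected representation set and classifies with k-NN under the Euclidean distance. It is a minimal sketch, not the authors' implementation: the graph dissimilarity measure is passed in as a callable (in the paper this role is played by the TWEC-based edit distance), and the function names and parameter values (n_prototypes, k, seed) are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def dissimilarity_embedding(graphs, prototypes, dissimilarity):
    """Embed each graph as the vector of its dissimilarities to the prototypes."""
    return np.array([[dissimilarity(g, p) for p in prototypes] for g in graphs])


def random_prototype_knn(train_graphs, train_labels, test_graphs,
                         dissimilarity, n_prototypes=20, k=3, seed=0):
    """Baseline classifier: random representation set + k-NN in the
    resulting dissimilarity space (Euclidean distance between embeddings)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(train_graphs), size=n_prototypes, replace=False)
    prototypes = [train_graphs[i] for i in idx]

    X_train = dissimilarity_embedding(train_graphs, prototypes, dissimilarity)
    X_test = dissimilarity_embedding(test_graphs, prototypes, dissimilarity)

    knn = KNeighborsClassifier(n_neighbors=k)  # Minkowski with p=2, i.e. Euclidean
    knn.fit(X_train, train_labels)
    return knn.predict(X_test)
```

In ODSE itself the representation set is not left fixed at its random initialization: it undergoes the compression and expansion operations listed in the outline below, and the overall synthesis is driven by evolutionary optimization.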

Summary

Introduction

Graphs offer powerful models for representing patterns characterized by interacting elements, in both static and dynamic scenarios. The Optimized Dissimilarity Space Embedding (ODSE) system has been proposed as a labeled graph classifier, achieving state-of-the-art classification accuracy on well-known benchmarking datasets [11]. The system estimates the informativeness of the dissimilarity representation of the input data by calculating the quadratic Rényi entropy (QRE) [39]. Such an entropic characterization has been used in the compression–expansion scheme as well as an important factor of the ODSE objective function. We elaborate further on the clustering-based compression (CBC) scheme first introduced in [40] by estimating the differential α-order Rényi entropy of the dissimilarity vectors (DVs) by means of a faster technique that relies on an entropic Minimum Spanning Tree (MST). In this case, we give a formal proof pertaining to the setting of the clustering algorithm governing the compression operation.
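For reference, the two entropy estimates at play can be sketched as follows. The first function is the standard Parzen-window plug-in estimator of the quadratic (α = 2) Rényi entropy; the second uses the total edge length of a Euclidean MST as an α-order entropy estimate, omitting the additive bias constant of the underlying Hero–Michel estimator since only relative comparisons matter for compression. This is a minimal sketch over vectors already embedded in the dissimilarity space; the kernel width σ and the value of α are illustrative, and the exact estimators used in ODSE may differ in their constants and normalization.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def quadratic_renyi_entropy(X, sigma=1.0):
    """Parzen-window (Gaussian kernel) plug-in estimate of the quadratic
    Renyi entropy: H_2(X) ~ -log( (1/N^2) * sum_ij G(x_i - x_j; 2*sigma^2) )."""
    n, d = X.shape
    sq_dists = squareform(pdist(X, metric="sqeuclidean"))
    # Convolving two Gaussian Parzen kernels of width sigma gives a Gaussian
    # kernel with variance 2*sigma^2 per dimension.
    kernel = np.exp(-sq_dists / (4.0 * sigma ** 2)) / (4.0 * np.pi * sigma ** 2) ** (d / 2.0)
    return -np.log(kernel.sum() / n ** 2)


def mst_alpha_entropy(X, alpha=0.5):
    """MST-based estimate of the alpha-order Renyi entropy (alpha in (0, 1)),
    up to the additive bias constant of the Hero-Michel estimator."""
    n, d = X.shape
    gamma = d * (1.0 - alpha)                      # exponent applied to edge lengths
    mst = minimum_spanning_tree(squareform(pdist(X)))
    total_length = np.power(mst.data, gamma).sum() # weighted length of the MST
    return np.log(total_length / n ** alpha) / (1.0 - alpha)


# Example on a toy embedding of 100 dissimilarity vectors in 5 dimensions:
# X = np.random.default_rng(0).normal(size=(100, 5))
# print(quadratic_renyi_entropy(X, sigma=0.5), mst_alpha_entropy(X, alpha=0.5))
```

Both functions operate on the pairwise distances among the DVs; in the paper, the MST route is reported as the faster of the two estimation techniques.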

Differential Rényi Entropy Estimators
The QRE Estimator
The MST-Based Estimator
The Original ODSE Graph Classifier
The ODSE Objective Function
The ODSE Compression Operation
The ODSE Expansion Operation
The Improved ODSE Graph Classifier
Randomized Representation Set Initialization
Compression by a Clustering-Based Subset Selection
Expansion Based on Replacement with Maximum Dissimilar Graphs
Analysis of Computational Complexity
ODSE with Mode Seeking Initialization
The Efficiency of the ODSE Clustering-Based Compression
ODSE with the MST-Based Rényi Entropy Estimator
Datasets
Experimental Setting
Results and Discussion
Conclusions and Future Directions