Abstract

Learning from and analyzing graph data is one of the most fundamental research areas in machine learning and data mining. Among the many graph-based data structures, this paper focuses on the graph bag (simply, bag): a training object that contains one or more graphs and whose label is available only at the bag level. Bags of this type can represent various real-world objects such as drugs, web pages, XML documents, and images, and much research has addressed models for learning from them. Within this research context, we define the novel problem of dynamic graph bag classification and propose an algorithm to solve it. Dynamic bag classification aims to build a classification model for bags that arrive in a streaming fashion, i.e., new bags or graphs emerge frequently over time. When the bag dataset changes in this way, our proposed algorithm incrementally updates the top-m most discriminative features instead of searching for them from scratch. Incremental gSpan and incremental gScore are proposed as the core parts of our algorithm to handle a stream of bags efficiently. We evaluate our algorithm on two real-world datasets in terms of both feature selection time and classification accuracy. The experimental results demonstrate that our algorithm derives an informative feature set much faster than the existing approach, which was originally designed for static bag data, with little loss of accuracy.
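
The idea of updating the top-m features incrementally, rather than recomputing them from scratch, can be illustrated with a minimal Python sketch. It is not the incremental gSpan or incremental gScore proposed in the paper: it assumes that the subgraph features of each bag have already been extracted and are given as hashable identifiers, and it uses a simple stand-in score (the absolute difference between a feature's support in positive and negative bags). The class name `IncrementalTopM` and its methods are hypothetical.

```python
from collections import Counter
import heapq


class IncrementalTopM:
    """Keeps per-feature bag counts so the top-m features can be refreshed
    after each new bag without rescanning earlier bags.

    Assumption: features are pre-extracted subgraph identifiers, and the
    score below is an illustrative stand-in, not the paper's gScore.
    """

    def __init__(self, m):
        self.m = m
        self.pos = Counter()  # feature -> number of positive bags containing it
        self.neg = Counter()  # feature -> number of negative bags containing it
        self.n_pos = 0
        self.n_neg = 0

    def add_bag(self, features, label):
        """Update counts with one newly arrived bag (label > 0 means positive)."""
        feats = set(features)  # count each feature once per bag
        if label > 0:
            self.n_pos += 1
            self.pos.update(feats)
        else:
            self.n_neg += 1
            self.neg.update(feats)

    def score(self, feature):
        """Absolute difference between positive and negative bag support."""
        p = self.pos[feature] / self.n_pos if self.n_pos else 0.0
        n = self.neg[feature] / self.n_neg if self.n_neg else 0.0
        return abs(p - n)

    def top_m(self):
        """Return the m highest-scoring features seen so far."""
        candidates = set(self.pos) | set(self.neg)
        return heapq.nlargest(self.m, candidates, key=self.score)


selector = IncrementalTopM(m=2)
selector.add_bag({"A", "B"}, +1)
selector.add_bag({"A", "C"}, +1)
selector.add_bag({"B", "C"}, -1)
selector.add_bag({"C"}, -1)
print(selector.top_m())
```

Each call to `add_bag` touches only the counts of the features in the new bag, which is what makes the update incremental. In this sketch the ranking step still scans all candidate features; mining new subgraph patterns as bags arrive is outside its scope.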
