Dual active feature and sample selection for graph classification

Xiangnan Kong, Philip S Yu, Wei Fan

doi:10.1145/2020408.2020511

Abstract

Graph classification has become an important and active research topic in the last decade. Current research on graph classification focuses on mining discriminative subgraph features under supervised settings. The basic assumption is that a large number of labeled graphs are available. However, labeling graph data is quite expensive and time consuming for many real-world applications. In order to reduce the labeling cost for graph data, we address the problem of how to select the most important graph to query for the label. This problem is challenging and different from conventional active learning problems because there is no predefined feature vector. Moreover, the subgraph enumeration problem is NP-hard. The active sample selection problem and the feature selection problem are correlated for graph data. Before we can solve the active sample selection problem, we need to find a set of optimal subgraph features. To address this challenge, we demonstrate how one can simultaneously estimate the usefulness of a query graph and a set of subgraph features. The idea is to maximize the dependency between subgraph features and graph labels using an active learning framework. We propose a branch-and-bound algorithm to search for the optimal query graph and optimal features simultaneously. Empirical studies on nine real-world tasks demonstrate that the proposed method can obtain better accuracy on graph data than alternative approaches.

Similar Papers

Semi-supervised feature selection for graph classification
Xiangnan Kong ... Philip S. Yu
25 Jul 2010

GMLC: a multi-label feature selection framework for graph classification
Xiangnan Kong ... Philip S Yu
Knowledge and Information Systems | Vol. 31
08 May 2011

Mining Brain Networks Using Multiple Side Views for Neurological Disorder Identification
Bokai Cao ... Ann B Ragin
01 Nov 2015

Incremental Subgraph Feature Selection for Graph Classification
Haishuai Wang ... Peng Zhang
IEEE Transactions on Knowledge and Data Engineering | Vol. 29
01 Jan 2017
