Abstract

As data mining is applied in fields such as information science, bioinformatics, and network intrusion detection, more and more data exhibit new characteristics, such as strong structure and complex relationships among data items. A graph is a complex data structure that can describe the relationships between things. Traditional graph classification methods based on graph feature vector construction must select a construction criterion in advance, such as graph-theoretic indices or occurrences of topological substructures, and then extract features from each graph in the graph set according to that criterion. However, constructing feature vectors in this way tends to lose structural information of the graph and requires substantial domain expertise. Inspired by the Word2Vec and Doc2Vec models in natural language processing (NLP), this paper first constructs a “word list” for graph data that consists of subgraphs. It then designs a neural network for training graph embeddings that takes the graph itself as input and uses the “words” in the graph, together with the attribute features of the graph, as output, so that the network automatically learns the embedding corresponding to each graph. The graph embedding reflects not only the features of the graph itself but also the relative relationships among graphs. Finally, given the trained graph embeddings, a standard classifier can be used to classify the graphs. Experiments on real-world bioinformatics and social data sets demonstrate that the proposed graph classification algorithm has advantages over existing graph classification algorithms based on feature vector construction.
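The pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation, and several points are assumptions made only for illustration: the subgraph "words" are generated by a Weisfeiler-Lehman-style neighborhood relabeling (the abstract does not say how subgraphs are extracted), the network is a Doc2Vec-like model with a full softmax over the word list, the attribute-feature output is omitted, and the names `subgraph_words` and `train_graph_embeddings` are hypothetical.

```python
import math
import random
from collections import Counter


def subgraph_words(adj, labels, iterations=1):
    """Bag of 'words' for one graph (assumed WL-style neighborhood labels)."""
    current = dict(labels)
    words = list(current.values())
    for _ in range(iterations):
        # Relabel each node with its own label plus its sorted neighbor labels.
        current = {
            node: current[node] + "(" + ",".join(sorted(current[n] for n in nbrs)) + ")"
            for node, nbrs in adj.items()
        }
        words.extend(current.values())
    return Counter(words)


def train_graph_embeddings(bags, dim=4, epochs=50, lr=0.05, seed=0):
    """Learn one vector per graph by predicting the words the graph contains."""
    rng = random.Random(seed)
    vocab = sorted({w for bag in bags for w in bag})
    index = {w: i for i, w in enumerate(vocab)}
    G = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in bags]   # graph embeddings
    W = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]  # output word vectors
    for _ in range(epochs):
        for g, bag in enumerate(bags):
            # Softmax over the whole word list (shifted by the max for stability).
            scores = [sum(a * b for a, b in zip(G[g], w)) for w in W]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            z = sum(exps)
            # Target: the empirical word distribution of this graph.
            total = sum(bag.values())
            target = [0.0] * len(vocab)
            for word, count in bag.items():
                target[index[word]] = count / total
            # Cross-entropy gradient with respect to the scores.
            err = [e / z - t for e, t in zip(exps, target)]
            grad_g = [sum(err[j] * W[j][k] for j in range(len(vocab))) for k in range(dim)]
            for j in range(len(vocab)):
                for k in range(dim):
                    W[j][k] -= lr * err[j] * G[g][k]
            for k in range(dim):
                G[g][k] -= lr * grad_g[k]
    return G, vocab


# Two toy graphs with identical node labels: a triangle and a three-node path.
node_labels = {0: "C", 1: "C", 2: "C"}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
bags = [subgraph_words(triangle, node_labels), subgraph_words(path, node_labels)]
G, vocab = train_graph_embeddings(bags)
```

Each row of `G` is the learned embedding of one graph and could be passed to any standard classifier for the final classification step. The full softmax is chosen here only for brevity; a larger word list would normally call for a cheaper approximation such as negative sampling.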
