Abstract
Recently, source code mining has received increasing attention due to the rapid growth of open-source code repositories and the tremendous value implied in this large body of data, which can help us understand how functions or classes are organized in different software projects and analyze the impact of these organizational patterns on software behavior. Hence, learning an effective representation model for source-code functions is a crucial problem. Considering the inherent hierarchy of functions, we propose a novel hyperbolic function embedding (HFE) method, which can learn a distributed and hierarchical representation for each function via the Poincaré ball model. To achieve this, a function call graph (FCG) is first constructed to model the call relationships among functions. To verify the underlying geometry of the FCG, the Ricci curvature model is used. Finally, an HFE model is built to learn representations that capture the latent hierarchy of functions in hyperbolic space, rather than the Euclidean space usually used by state-of-the-art methods. Moreover, HFE is more compact than existing graph embedding methods, requiring a lower embedding dimensionality, and is therefore more efficient in both computation and storage. To experimentally evaluate the performance of HFE, two application scenarios, namely function classification and link prediction, have been applied. HFE achieves up to 7.6% performance improvement compared to the chosen state-of-the-art methods, namely Node2vec and Struc2vec.
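To make the geometric setting concrete, the following sketch (a minimal illustration, not the paper's implementation) computes the geodesic distance between two points of the Poincaré ball, which is the distance a hyperbolic embedding such as HFE optimizes over; the function names and coordinates are hypothetical.

import numpy as np

def poincare_distance(u, v, eps=1e-9):
    # Geodesic distance between two points strictly inside the unit Poincare ball.
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_diff / (denom + eps))

# Hypothetical 2-D embeddings: points near the origin tend to sit high in the
# call hierarchy, while points near the boundary behave like leaf functions.
caller = np.array([0.10, 0.05])   # e.g., a high-level entry function
callee = np.array([0.70, 0.50])   # e.g., a low-level utility function
print(poincare_distance(caller, callee))

Because distances blow up toward the boundary of the ball, a small number of dimensions can separate many leaf-level functions, which is the intuition behind HFE's compactness claim.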
Highlights
There are billions of lines of source code open to the software community on the Internet (e.g., on GitHub).
We propose a novel hyperbolic function embedding method, which can learn a distributed and hierarchical representation for each function via the Poincaré ball model.
We use the Ricci curvature to describe the intrinsic geometry of the function call graph (FCG).
Summary
There are billions of lines of source code (e.g., on GitHub) open to the software community on the Internet. Inspired by word2vec [5], various embedding techniques that learn function representations have received a great deal of attention, because the features learned by embedding functions into a vector space can compactly encode their latent semantic structure. These embedding vectors can achieve better performance as pre-trained inputs to machine learning models. We use Ricci curvature [15] to estimate the geometric structure of the FCG and find that the curvature of most edges in the FCG is negative. This phenomenon suggests hyperbolic space, rather than Euclidean space, as a natural embedding space for the FCG, since hyperbolic space is usually associated with constant negative curvature [16].
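As a rough illustration of this curvature check, the sketch below computes the Ollivier-Ricci curvature of a single edge on a toy, unweighted call graph. The paper does not spell out its exact curvature formulation here, so treating it as Ollivier-Ricci is an assumption, and the toy graph and node names are made up.

import networkx as nx
import numpy as np
from scipy.optimize import linprog

def ollivier_ricci(G, x, y, alpha=0.5):
    # Ollivier-Ricci curvature of edge (x, y) on an unweighted graph:
    # each endpoint spreads mass alpha on itself and (1 - alpha) uniformly on
    # its neighbours; curvature = 1 - W1(m_x, m_y) / d(x, y), with d(x, y) = 1.
    def measure(v):
        nbrs = list(G.neighbors(v))
        return [v] + nbrs, np.array([alpha] + [(1.0 - alpha) / len(nbrs)] * len(nbrs))

    sx, mx = measure(x)
    sy, my = measure(y)
    dist = dict(nx.all_pairs_shortest_path_length(G))
    cost = np.array([[dist[a][b] for b in sy] for a in sx], dtype=float)

    # Wasserstein-1 distance as a linear program over the transport plan.
    n, m = cost.shape
    A_eq = []
    for i in range(n):                       # row marginals must equal mx
        row = np.zeros(n * m); row[i * m:(i + 1) * m] = 1.0; A_eq.append(row)
    for j in range(m):                       # column marginals must equal my
        col = np.zeros(n * m); col[j::m] = 1.0; A_eq.append(col)
    res = linprog(cost.ravel(), A_eq=np.array(A_eq),
                  b_eq=np.concatenate([mx, my]), bounds=(0, None))
    return 1.0 - res.fun

# Hypothetical call graph: main calls parse and run; run calls step and log.
G = nx.Graph([("main", "parse"), ("main", "run"), ("run", "step"), ("run", "log")])
print(ollivier_ricci(G, "main", "run"))  # tree-like edges come out negative

On tree-like structures such as this toy FCG, most edges get negative curvature, which is the kind of evidence the summary cites for choosing a hyperbolic embedding space.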