Abstract

This paper presents an efficient graph semisupervised learning (GSSL) method that reaches its optimization objective without iteration. Most existing GSSL methods require iterative optimization to achieve a preset objective because they treat data points as being in peer-to-peer relationships. In addition, existing GSSL methods must learn from scratch for unseen data because the graph structure is built specifically for a given dataset. By leveraging the partial order relationships induced by the local density of, and distances between, data points, we develop a novel label propagation algorithm based on the data structure of an optimal leading forest (OLeaF). Once an OLeaF has been constructed, the time complexity of our method is O(N) both for labeling the unlabeled data in the dataset and for labeling new data. The two main weaknesses of traditional GSSL are therefore addressed. The constructed leading forest also offers good interpretability of the learning results. We scale the proposed method to big data by using the block distance matrix technique and locality-sensitive hashing. Extensive experiments on datasets with different characteristics demonstrate the superior efficiency and competitive accuracy of the proposed method.
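
To make the idea of non-iterative propagation over a density-induced partial order more concrete, the following is a minimal sketch and not the paper's algorithm. It rests on several assumptions that the abstract does not state: a Gaussian-kernel local density, a "nearest denser point" rule for choosing each point's parent, a hypothetical `max_link` distance cutoff that decides which points become roots (the paper's optimality criterion for the forest is not reproduced), and a simple two-pass vote in place of the paper's propagation rules. The function names `build_leading_forest` and `propagate_labels` are illustrative only.

```python
import math
from collections import Counter


def build_leading_forest(points, sigma=1.0, max_link=1.0):
    """Return (parent, order) for a density-based forest over `points`.

    parent[i] is the nearest point that ranks above i in density, or -1 if i
    is a root. `sigma` and `max_link` are assumed, illustrative parameters.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Assumed density estimate: sum of Gaussian kernels over all other points.
    density = [
        sum(math.exp(-(dist[i][j] / sigma) ** 2) for j in range(n) if j != i)
        for i in range(n)
    ]
    # Break density ties by index so the ranking is a strict order (no cycles).
    rank = [(density[i], -i) for i in range(n)]
    parent = [-1] * n
    for i in range(n):
        denser = [j for j in range(n) if rank[j] > rank[i]]
        if denser:
            leader = min(denser, key=lambda j: dist[i][j])
            # Points whose leader is too far away start their own tree.
            if dist[i][leader] <= max_link:
                parent[i] = leader
    # Ascending rank: every child appears before its parent.
    order = sorted(range(n), key=lambda i: rank[i])
    return parent, order


def propagate_labels(parent, order, known):
    """Two passes over the forest, each visiting every node once."""
    labels = [known.get(i) for i in range(len(parent))]
    votes = [Counter() for _ in parent]
    # Pass 1 (children to parents): unlabeled parents take the majority vote.
    for i in order:
        if labels[i] is None and votes[i]:
            labels[i] = votes[i].most_common(1)[0][0]
        if labels[i] is not None and parent[i] != -1:
            votes[parent[i]][labels[i]] += 1
    # Pass 2 (parents to children): remaining nodes inherit the parent label.
    for i in reversed(order):
        if labels[i] is None and parent[i] != -1:
            labels[i] = labels[parent[i]]
    return labels


# Two well-separated groups on a line, with one labeled point in each.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 0.0), (5.1, 0.0), (5.2, 0.0)]
parent, order = build_leading_forest(points)
labels = propagate_labels(parent, order, {0: "a", 5: "b"})
```

In this sketch the pairwise distance matrix makes construction quadratic in N, whereas each propagation pass touches every node exactly once, which is the sense in which no iterative optimization is involved. A tree that contains no labeled point stays unlabeled here; handling that case would need an additional rule that this sketch leaves out.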
