Abstract

Collective classification (CC) is the task of jointly classifying related instances in network data. Enabling CC usually improves the performance of predictive models on fully labeled training networks with a large amount of labeled data. However, acquiring such labels can be difficult and costly, and learning a CC classifier from only a few labeled instances can lead to poor performance. On the other hand, large amounts of unlabeled data are usually available in practice. This naturally motivates semi-supervised collective classification (SSCC) approaches, which leverage the unlabeled data to improve CC on a sparsely labeled network. In this paper, we propose a novel SSCC algorithm based on non-negative matrix factorization (NMF), called NMF-SSCC, which learns a data representation by exploiting both labeled and unlabeled data on the network. Our idea is to use matrix factorization to obtain a compact representation of the network data that captures the class discrimination inferred from the labeled instances while respecting the intrinsic network structure. To achieve this, we design a new matrix factorization objective function that incorporates a label matrix factorization term and a network regularization term. We then develop an efficient optimization algorithm based on multiplicative update rules to solve the new objective function. To further boost predictive performance, we extend NMF-SSCC into an ensemble scheme, called NMFE-SSCC, which builds a classification ensemble from a set of NMF-SSCC collective classifiers, each learned from a different constructed latent graph. Each latent graph is generated with various latent linkages for effective label propagation. Experimental results on real-world data sets demonstrate the effectiveness of the new methods.
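
To make the structure of such an objective concrete, the following is a minimal sketch of a generic label-aware, graph-regularized NMF solved with multiplicative updates. It is not the NMF-SSCC objective itself, which the abstract does not spell out. The objective form, the names `nmf_sscc_sketch`, `objective`, `alpha`, `beta`, and `k`, and the toy network are all assumptions made for illustration: we minimize ||X - VU||^2 + alpha ||M(Y - VB)||^2 + beta tr(V^T (D - A) V), where X holds non-negative node features, M masks the labeled nodes, and A is the adjacency matrix with degree matrix D.

```python
import numpy as np

def objective(X, Y, M, A, V, U, B, alpha, beta):
    """Assumed objective: reconstruction + masked label fit + graph smoothness."""
    D = np.diag(A.sum(axis=1))
    recon = np.linalg.norm(X - V @ U) ** 2
    label = np.linalg.norm(M[:, None] * (Y - V @ B)) ** 2
    smooth = np.trace(V.T @ (D - A) @ V)
    return recon + alpha * label + beta * smooth

def nmf_sscc_sketch(X, Y, labeled, A, k, alpha=1.0, beta=0.1,
                    iters=200, seed=0, eps=1e-9):
    """Hypothetical helper: multiplicative updates for the assumed objective."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    c = Y.shape[1]
    M = labeled.astype(float)          # 1.0 for labeled nodes, 0.0 otherwise
    deg = A.sum(axis=1)                # node degrees (diagonal of D)
    V = rng.random((n, k)) + eps       # latent node representation
    U = rng.random((k, d)) + eps       # feature basis
    B = rng.random((k, c)) + eps       # maps latent factors to classes
    history = []
    for _ in range(iters):
        # Each factor is scaled by (negative gradient part) / (positive part),
        # which keeps all entries non-negative.
        U *= (V.T @ X) / (V.T @ V @ U + eps)
        MV = M[:, None] * V
        B *= (MV.T @ Y) / (MV.T @ V @ B + eps)
        num = X @ U.T + alpha * (M[:, None] * Y) @ B.T + beta * (A @ V)
        den = (V @ (U @ U.T) + alpha * (M[:, None] * (V @ B)) @ B.T
               + beta * deg[:, None] * V + eps)
        V *= num / den
        history.append(objective(X, Y, M, A, V, U, B, alpha, beta))
    return V, U, B, history

# Toy network (made up for illustration): two fully connected groups of
# 10 nodes each, with 2 labeled nodes per group.
n, half = 20, 10
rng = np.random.default_rng(1)
A = np.zeros((n, n))
A[:half, :half] = 1.0
A[half:, half:] = 1.0
np.fill_diagonal(A, 0.0)
X = 0.1 * rng.random((n, 6))
X[:half, :3] += 1.0                    # group 0 is active on features 0-2
X[half:, 3:] += 1.0                    # group 1 is active on features 3-5
Y = np.zeros((n, 2))
labeled = np.zeros(n, dtype=bool)
labeled[[0, 1, 10, 11]] = True
Y[[0, 1], 0] = 1.0
Y[[10, 11], 1] = 1.0

V, U, B, history = nmf_sscc_sketch(X, Y, labeled, A, k=2)
pred = (V @ B).argmax(axis=1)          # predicted class for every node
```

In this sketch the label term pulls the representation of labeled nodes toward their classes, while the graph term encourages linked nodes to share similar representations, which is how unlabeled nodes receive class information. An ensemble in the spirit of NMFE-SSCC could, under the same assumptions, run this routine on several different adjacency matrices and combine the resulting predictions.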
