Abstract
Heterogeneous Information Network (HIN) collective classification studies the problem of predicting labels for one type of node in a HIN that contains multiple types of nodes and multiple types of links among them. Previous studies have revealed that exploiting the relative importance of links is useful for improving node classification performance, as connected nodes tend to have similar labels. Most existing approaches exploit the relative importance of links either by directly counting the number of connections among nodes or by learning the weight of each type of link from labeled data only. However, these approaches either neglect the importance of link types to the class labels or may lead to overfitting. We propose a Tensor-based Markov chain (T-Mark) approach, which is able to automatically and simultaneously predict the labels for unlabeled nodes and give the relative importance of the types of links that actually improve the classification accuracy. Specifically, we build two tensor equations by using the HIN and the features of nodes from both labeled and unlabeled data. A Markov chain-based model is proposed, and it is solved by an iterative process to obtain the stationary distributions. Theoretical analyses of the existence and uniqueness of such probability distributions are given. Extensive experimental results demonstrate that T-Mark achieves superior performance in the comparison and obtains reasonable relative importance of links.
More From: IEEE Transactions on Knowledge and Data Engineering