Abstract
Spectral analysis is critical in social network analysis. Graph construction is a vital step of spectral analysis, yet many existing works build the graph from content data only. Unfortunately, content data often consists of noisy, sparse, and redundant features, which makes the resulting graph unstable and unreliable. In practice, social network data also contains link information, which provides an additional signal for graph construction. Some previous works utilize the link data. However, link data is often incomplete, which makes the resulting graph incomplete as well. To address these issues, we propose a novel Distilled Graph Clustering (DGC) method. It pursues a distilled graph based on both the content data and the link data. The proposed algorithm alternates between two steps: in the feature selection step, it finds the most representative feature subset w.r.t. an intermediate graph initialized with the link data; in the graph distillation step, it updates and refines the graph based on only the selected features. The final graph, referred to as the distilled graph, is then used for spectral clustering on large-scale social network data. Extensive experiments demonstrate the superiority of the proposed method.
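The alternation described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual objective or update rules, so the choices below are stand-in assumptions rather than the DGC method itself: features are ranked with the Laplacian score against the current graph, and the graph is rebuilt as a symmetric k-nearest-neighbor graph over the selected features. The function names, the parameters `m`, `k`, and `iters`, and the toy data are all hypothetical, and the final spectral clustering step is omitted.

```python
def laplacian_scores(X, W):
    """Laplacian score of each feature w.r.t. graph W (lower = smoother on the graph).

    Stand-in criterion (assumption); not the selection rule from the paper.
    """
    n, d = len(X), len(X[0])
    deg = [sum(row) for row in W]
    total = sum(deg)
    if total == 0:
        raise ValueError("graph must contain at least one edge")
    scores = []
    for f in range(d):
        col = [X[i][f] for i in range(n)]
        # Remove the degree-weighted mean so constant features are not favored.
        mean = sum(v * g for v, g in zip(col, deg)) / total
        c = [v - mean for v in col]
        # f^T L f written as a sum over edges; f^T D f as a degree-weighted norm.
        num = 0.5 * sum(W[i][j] * (c[i] - c[j]) ** 2
                        for i in range(n) for j in range(n))
        den = sum(g * v * v for g, v in zip(deg, c))
        scores.append(num / den if den > 0 else float("inf"))
    return scores


def select_features(X, W, m):
    """Feature selection step: keep the m features with the lowest score."""
    scores = laplacian_scores(X, W)
    ranked = sorted(range(len(scores)), key=scores.__getitem__)
    return sorted(ranked[:m])


def refine_graph(X, feats, k):
    """Graph update step: symmetric kNN graph using only the selected features."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (sum((X[i][f] - X[j][f]) ** 2 for f in feats), j)
            for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            W[i][j] = W[j][i] = 1.0
    return W


def distill(X, links, m, k, iters=5):
    """Alternate the two steps, starting from the (possibly incomplete) link graph."""
    W = [row[:] for row in links]
    feats = []
    for _ in range(iters):
        new_feats = select_features(X, W, m)
        W = refine_graph(X, new_feats, k)
        if new_feats == feats:  # selected subset stopped changing
            break
        feats = new_feats
    return W, feats


# Toy data: feature 0 separates nodes {0,1,2} from {3,4,5}; feature 1 is noise.
X = [[0.0, 0.9], [0.1, 0.1], [0.05, 0.5],
     [1.0, 0.2], [0.9, 0.8], [0.95, 0.4]]
# Incomplete link data: only edges 0-1 and 3-4 are observed.
links = [[0.0] * 6 for _ in range(6)]
for a, b in [(0, 1), (3, 4)]:
    links[a][b] = links[b][a] = 1.0

W, feats = distill(X, links, m=1, k=2)
```

On this toy input the sketch keeps only the informative feature and connects nodes 2 and 5, which had no observed links, to their own groups. The returned matrix `W` could then be passed as a precomputed affinity to any standard spectral clustering routine.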
More From: IEEE Transactions on Knowledge and Data Engineering