Abstract
A problem that frequently occurs when mining complex networks is selecting algorithms with which to rank the relevance of nodes to metadata groups characterized by a small number of examples. The best algorithms are often found through experiments on labeled networks or through unsupervised structural community quality measures. However, new networks could exhibit characteristics different from the labeled ones, whereas structural community quality measures favor dense congregations of nodes but not metadata groups spanning a wide breadth of the network. To avoid these shortcomings, in this work we propose using unsupervised measures that assess node rank quality across multiple metadata groups through their ability to reconstruct the local structures of network nodes; these structures are retrieved from the network and not assumed. Three types of local structures are explored: linked nodes, nodes up to two hops away and nodes forming triangles. We compare the resulting measures, alongside unsupervised structural community quality measures, to the AUC and NDCG of supervised evaluation in one synthetic and four real-world labeled networks. Our experiments suggest that our proposed local structure measures are often more accurate for unsupervised pairwise comparison of ranking algorithms, especially when few example nodes are provided. Furthermore, the measure of the ability to reconstruct the extended neighborhood, which we call HopAUC, manages to select a near-best algorithm among many ranking algorithms in most networks.
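The idea of scoring node ranks by how well they reconstruct local structure can be illustrated with a minimal sketch. It is not the paper's exact definition: it assumes that the similarity of two nodes is the dot product of their rank vectors across metadata groups, that a node pair counts as "locally related" when it is linked (hops=1) or shares a neighbor (hops=2), and that quality is the AUC of similarities separating related from unrelated pairs. The function names `pairwise_auc` and `local_structure_auc` are hypothetical.

```python
from itertools import combinations


def pairwise_auc(pos, neg):
    """Probability that a positive score exceeds a negative one (ties count 0.5)."""
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def local_structure_auc(nodes, edges, ranks, hops=1):
    """Sketch of an unsupervised rank quality measure (assumed formulation).

    nodes: list of node identifiers
    edges: iterable of undirected (u, v) pairs
    ranks: dict mapping node -> list of rank scores, one per metadata group
    hops:  1 to reconstruct links, 2 to also count pairs sharing a neighbor
    """
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def related(u, v):
        if v in adj[u]:
            return True
        return hops >= 2 and bool(adj[u] & adj[v])

    pos, neg = [], []
    for u, v in combinations(nodes, 2):
        # Assumption: dot product of rank vectors as node similarity.
        similarity = sum(a * b for a, b in zip(ranks[u], ranks[v]))
        (pos if related(u, v) else neg).append(similarity)
    return pairwise_auc(pos, neg)


nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("c", "d")]
good_ranks = {"a": [1.0, 0.0], "b": [0.9, 0.0], "c": [0.0, 1.0], "d": [0.0, 0.8]}
poor_ranks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.0], "d": [0.0, 1.0]}
print(local_structure_auc(nodes, edges, good_ranks))  # ranks aligned with edges
print(local_structure_auc(nodes, edges, poor_ranks))  # ranks that cut across edges
```

In this toy graph the ranks that place linked nodes in the same group score higher than the ranks that separate them, which is the behavior an unsupervised comparison of ranking algorithms would rely on. Enumerating all node pairs is quadratic, so a practical implementation would likely sample pairs.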
Highlights
The nodes of complex networks are often organized into communities that mirror the systemic properties of their real-world counterparts
To create unsupervised procedures tailored to evaluating non-local metadata group communities, in our recent work (Krasanakis et al 2019a) we proposed that only high-quality ranking algorithms can capture the relatedness of nodes to metadata groups that drive the formation of network edges
Comparing any pair of ranking algorithms: we start by comparing the order of ranking algorithms arising from unsupervised measures with the ordering provided by Area Under Curve (AUC) and Normalized Discounted Cumulative Gain (NDCG)
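For reference, the two supervised measures can be sketched as follows for a single metadata group. This is a minimal illustration under assumed conventions (binary ground-truth labels, a log2 discount, evaluation over all listed nodes), which may differ in detail from the paper's setup; the function names `auc` and `ndcg` are illustrative.

```python
import math


def auc(scores, labels):
    """Area Under Curve: chance that a member outranks a non-member (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both members and non-members")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ndcg(scores, labels):
    """Normalized Discounted Cumulative Gain with binary relevance."""
    if sum(labels) == 0:
        raise ValueError("need at least one member")
    # Sort node indices from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    dcg = sum(labels[i] / math.log2(pos + 2) for pos, i in enumerate(order))
    # Ideal ordering places all members first.
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(sum(labels)))
    return dcg / ideal


scores = [0.9, 0.8, 0.3, 0.1]   # rank scores of four nodes for one group
labels = [1, 0, 1, 0]           # 1 if the node truly belongs to the group
print(auc(scores, labels))      # 0.75
print(round(ndcg(scores, labels), 4))
```

AUC weighs every member/non-member pair equally, whereas NDCG emphasizes the top of the ranking, so the two can order ranking algorithms differently.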
Summary
The nodes of complex networks are often organized into (overlapping) communities that mirror the systemic properties of their real-world counterparts. Researchers have theorized that this organization exhibits strong locality, i.e. that nodes with similar attributes are concentrated into small areas, and have tried to discover structural ground truth communities whose nodes are tightly knit together (Fortunato and Hric 2016; Leskovec et al 2010; Xie et al 2013; Papadopoulos et al 2012). However, metadata groups are organized into tightly knit structural communities only if they are correlated with the attributes that most influence the formation of network edges. Depending on the modeled attribute, metadata groups can overlap or span wide areas of a network, for example when social media users take on several of only a few available attributes.