Abstract

Automatic annotation is an essential technique for handling and organizing Web objects (e.g., Web pages), whose number has grown at an unprecedented rate over the last few years. It is usually formulated as a multi-label classification problem. Unfortunately, labeled data are often time-consuming and expensive to obtain, and Web data have a much richer feature space. This calls for new semi-supervised approaches that classify effectively with less labeled data. In this paper, we propose a graph-based semi-supervised learning approach that leverages random walks and l1 sparse reconstruction on a mixed object-label graph, carrying both attribute and structure information, for effective multi-label classification. The mixed graph contains an object-affinity subgraph, a label-correlation subgraph, and object-label edges whose adaptive weights indicate the assignment relationships. The object-affinity subgraph is constructed using l1 sparse graph reconstruction with extracted structural meta-text, while the label-correlation subgraph captures pairwise correlations among labels via a linear combination of their co-occurrence similarity and kernel-based similarity. A random walk with adaptive weight assignment is then performed on the constructed mixed graph to infer probabilistic assignment relationships between labels and objects. Extensive experiments on real Yahoo! Web datasets demonstrate the effectiveness of our approach.
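
As a rough illustration of the pipeline described above, the following Python sketch builds a small mixed object-label graph and runs a random walk with restart on it. It is a simplified reading of the abstract, not the paper's algorithm: the object-affinity matrix `A` is supplied by hand instead of being learned by l1 sparse reconstruction, the object-label edge weights are fixed rather than adaptive, and the cosine and RBF similarities, the function names, and the parameters `alpha`, `gamma`, and `restart` are all assumptions made for this example.

```python
import numpy as np

def label_correlation(Y, alpha=0.5, gamma=1.0):
    """Blend co-occurrence and kernel similarity between labels.

    Assumption: co-occurrence is measured by cosine similarity between
    label columns and the kernel is an RBF; the abstract names neither.
    """
    cols = np.asarray(Y, dtype=float).T           # one row per label
    norms = np.linalg.norm(cols, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                       # avoid division by zero
    unit = cols / norms
    cooc = unit @ unit.T                          # cosine co-occurrence
    sq = np.sum((cols[:, None, :] - cols[None, :, :]) ** 2, axis=2)
    kern = np.exp(-gamma * sq)                    # RBF kernel similarity
    return alpha * cooc + (1 - alpha) * kern

def mixed_graph(A, B, C):
    """Stack object-object (A), object-label (B), label-label (C) weights."""
    return np.block([[A, B], [B.T, C]])

def random_walk_with_restart(W, start, restart=0.15, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walk restarting at `start`."""
    W = np.asarray(W, dtype=float)
    col_sums = W.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0
    P = W / col_sums                              # column-stochastic transitions
    e = np.zeros(W.shape[0])
    e[start] = 1.0
    p = e.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (P @ p) + restart * e
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy data: objects 0-2 are labeled, object 3 is unlabeled.
Y = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 0]], dtype=float)

# Hand-made object affinities (a stand-in for l1 sparse reconstruction):
# object 3 is close to objects 0 and 1, and only weakly tied to object 2.
A = np.array([[0.0, 1.0, 0.0, 1.0],
              [1.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0, 0.1],
              [1.0, 1.0, 0.1, 0.0]])

C = label_correlation(Y)
W = mixed_graph(A, Y, C)

p = random_walk_with_restart(W, start=3)
label_scores = p[A.shape[0]:]                     # mass on the label nodes
print(label_scores)
```

In this toy graph the walk started at the unlabeled object 3 places more mass on the first label node than on the second, because object 3's strongest neighbors carry the first label. Normalizing `label_scores` would give a crude probabilistic assignment in the spirit of the one the abstract describes.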
