Abstract

In multilabel classification, each instance in the training set is associated with a set of labels, and the task is to output a label set whose size is unknown a priori for each unseen instance. The most commonly used approach for multilabel classification is to learn a binary classifier independently for each possible class. However, multilabeled data generally exhibit relationships between labels, and this approach fails to take such relationships into account. In this paper, we describe an original method for multilabel classification problems derived from a Bayesian version of the k-nearest neighbor (k-NN) rule. The method developed here is an improvement on an existing method for multilabel classification, namely multilabel k-NN, which takes into account the dependencies between labels. Experiments on simulated and benchmark datasets show the usefulness and the efficiency of the proposed approach as compared to other existing methods.

Highlights

  • Traditional single-label classification assigns an object to exactly one class, from a set of Q disjoint classes

  • Each document may belong to multiple topics, such as arts and humanities [2,3,4,5]; in gene functional analysis, each gene may be associated with a set of functional classes, such as energy, metabolism, and cellular biogenesis [6]; in natural scene classification, each image may belong to several image types at the same time, such as sea and sunset [1]

  • The model parameters for DMLkNN are the number of neighbors k, the fuzziness parameter δ, and the smoothing parameter s


Summary

Introduction

Traditional single-label classification assigns an object to exactly one class, from a set of Q disjoint classes. In [10], the authors present a Bayesian multilabel k-nearest neighbor (MLkNN) approach where, in order to assign a set of labels to a new instance, a decision is made separately for each label by taking into account the number of neighbors containing the label to be assigned. This method fails to take into account the dependency between labels. We present a generalization of the MLkNN-based approach to multilabel classification problems where the dependencies between classes are considered. We call this method DMLkNN, for dependent multilabel k-nearest neighbor.
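The per-label MLkNN decision described above can be sketched as follows: for each label, estimate a smoothed prior from the training set, count how many of a query's k nearest neighbors carry the label, and make a maximum-a-posteriori decision from likelihoods of those counts. This is a minimal illustrative sketch (the function names and the Euclidean-distance choice are assumptions, not the authors' code), using the neighbor count k and smoothing parameter s mentioned in the highlights; it does not model the inter-label dependencies that DMLkNN adds.

```python
import numpy as np

def fit_mlknn(X, Y, k=3, s=1.0):
    """Estimate MLkNN-style priors and neighbor-count likelihoods.

    X: (n, d) feature matrix; Y: (n, q) binary label matrix.
    k: number of neighbors; s: Laplace smoothing parameter.
    Hypothetical helper for illustration only.
    """
    n, q = Y.shape
    # Smoothed prior probability that an instance carries each label.
    prior = (s + Y.sum(axis=0)) / (2 * s + n)

    # For each training instance, count each label's occurrences among
    # its k nearest neighbors (excluding the instance itself).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    nn = np.argsort(dists, axis=1)[:, :k]
    counts = Y[nn].sum(axis=1)  # shape (n, q), entries in 0..k

    # Tally counts separately for instances with / without each label.
    c1 = np.zeros((q, k + 1))
    c0 = np.zeros((q, k + 1))
    for i in range(n):
        for l in range(q):
            if Y[i, l]:
                c1[l, counts[i, l]] += 1
            else:
                c0[l, counts[i, l]] += 1
    # Smoothed likelihoods P(count = c | label present / absent).
    like1 = (s + c1) / (s * (k + 1) + c1.sum(axis=1, keepdims=True))
    like0 = (s + c0) / (s * (k + 1) + c0.sum(axis=1, keepdims=True))
    return prior, like1, like0

def predict_mlknn(x, X, Y, prior, like1, like0, k=3):
    """MAP decision per label from the query's neighbor label counts."""
    nn = np.argsort(np.linalg.norm(X - x, axis=1))[:k]
    counts = Y[nn].sum(axis=0)
    pred = np.zeros(Y.shape[1], dtype=int)
    for l in range(Y.shape[1]):
        p_yes = prior[l] * like1[l, counts[l]]
        p_no = (1 - prior[l]) * like0[l, counts[l]]
        pred[l] = int(p_yes > p_no)  # decide each label independently
    return pred
```

On a toy dataset with two well-separated clusters carrying disjoint labels, a query near the first cluster is assigned that cluster's label only. The key limitation, and the motivation for DMLkNN, is visible in the last loop: each label's decision uses only its own neighbor count, never the counts of the other labels.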

Related Work
Multilabel Classification
The DMLkNN Method for Multilabel Classification
Experiments
Prediction-Based Metrics
Conclusion
