Interactive classification aims to introduce user preferences into the learning process in order to produce individualized outcomes better adapted to each user’s behavior than fully automatic approaches. Current interactive classification systems generally adopt a single-label classification paradigm that constrains each item to carry one label at a time and consequently limits users’ expressiveness when they interact with data that are inherently multi-label. Moreover, the experimental evaluations are mainly subjective and depend closely on the targeted use cases and the interface characteristics. This paper presents the first extensive study of the impact of interactivity constraints on the performance of a large set of twelve well-established multi-label learning methods. We restrict ourselves to evaluating the classifiers’ predictive and computation-time performance as the number of training examples steadily increases, and we focus on the beginning of the classification task, where few examples are available. Classifier performance is evaluated with an experimental protocol independent of any implementation environment on a set of twelve multi-label benchmarks of various sizes from different domains. Our comparison shows that four classifiers stand out in prediction quality: RF-PCT (Random Forest of Predictive Clustering Trees; Kocev, 2011), EBR (Ensemble of Binary Relevance; Read et al., 2011), CLR (Calibrated Label Ranking; Fürnkranz et al., 2008), and MLkNN (Multi-label kNN; Zhang and Zhou, 2007), with an advantage for the first two ensemble classifiers. Moreover, only RF-PCT also competes with the fastest classifiers and is therefore the most promising classifier for an interactive multi-label learning system.