Optimal clustering from noisy binary feedback

Kaito Ariu, Alexandre Proutiere, Seyoung Yun, Jungseul Ok

doi:10.1007/s10994-024-06532-z

Abstract

We study the problem of clustering a set of items from binary user feedback. Such a problem arises in crowdsourcing platforms that solve large-scale labeling tasks with minimal effort required from users. For example, in some recent reCAPTCHA systems, users' clicks (binary answers) can be used to efficiently label images. In our inference problem, items are grouped into initially unknown, non-overlapping clusters. To recover these clusters, the learner sequentially presents to users a finite list of items together with a question with a binary answer, selected from a fixed finite set. For each of these items, the user provides a noisy answer whose expectation is determined by the item's cluster, by the question, and by an item-specific parameter characterizing the hardness of classifying the item. The objective is to devise an algorithm with a minimal cluster recovery error rate. We derive problem-specific information-theoretical lower bounds on the error rate satisfied by any algorithm, for both uniform and adaptive (list, question) selection strategies. For uniform selection, we present a simple algorithm built upon the K-means algorithm, whose performance almost matches the fundamental limits. For adaptive selection, we develop an algorithm that is inspired by the derivation of the information-theoretical error lower bounds and that, in turn, allocates the budget efficiently. The algorithm learns to select hard-to-cluster items and relevant questions more often. We numerically compare the performance of our algorithms with and without the adaptive selection strategy and illustrate the gain achieved by being adaptive.
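The feedback model and the uniform selection setting described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the abstract does not give the exact response model, so the parametrization below (a per-cluster, per-question probability `p[k, l]` that is blended toward its complement by an item hardness `h` in [0.5, 1]) is an assumption, as are all function names, the numeric values, and the use of plain Lloyd iterations with farthest-point initialization in place of the paper's K-means-based procedure.

```python
import itertools
import numpy as np

def simulate_uniform_feedback(p, clusters, hardness, budget, rng):
    """Draw (item, question) pairs uniformly and return empirical answer rates."""
    n, num_questions = len(clusters), p.shape[1]
    items = rng.integers(0, n, size=budget)
    questions = rng.integers(0, num_questions, size=budget)
    base = p[clusters[items], questions]
    h = hardness[items]
    # Assumed response model: h = 1 gives a clean answer, h = 0.5 pure noise.
    prob_yes = h * base + (1 - h) * (1 - base)
    answers = (rng.random(budget) < prob_yes).astype(float)
    sums = np.zeros((n, num_questions))
    counts = np.zeros((n, num_questions))
    np.add.at(sums, (items, questions), answers)
    np.add.at(counts, (items, questions), 1.0)
    return sums / np.maximum(counts, 1.0)

def kmeans(points, k, iters=50):
    """Plain Lloyd iterations with farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def misclassified(true, pred, k):
    """Number of misclassified items, minimized over cluster relabelings."""
    return min(int(np.sum(np.array(perm)[pred] != true))
               for perm in itertools.permutations(range(k)))

rng = np.random.default_rng(0)
p = np.array([[0.9, 0.2, 0.8],   # hypothetical answer statistics, cluster 0
              [0.2, 0.9, 0.3]])  # hypothetical answer statistics, cluster 1
clusters = rng.integers(0, 2, size=60)
hardness = rng.uniform(0.8, 1.0, size=60)
means = simulate_uniform_feedback(p, clusters, hardness, budget=9000, rng=rng)
labels = kmeans(means, k=2)
errors = misclassified(clusters, labels, 2)
print("misclassified items:", errors)
```

Lowering `budget` or pushing `hardness` toward 0.5 makes the empirical answer rates noisier, which is the regime where spending more of the budget on hard items, as the adaptive strategy is described to do, would be expected to help.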

Journal: Machine Learning
Publication Date: Mar 22, 2024
License type: CC BY 4.0
