Similarity Search for Efficient Active Learning and Search of Rare Concepts

Cody Coleman, Robert Nowak, Julian Katz-Samuels, Peter Bailis, Matei Zaharia, Alexander C Berg, Edward Chou, Roshan Sumbaly, Sean Culatana, I Zeki Yalniz

doi:10.1609/aaai.v36i6.20591

Abstract

Many active learning and search approaches are intractable for large-scale industrial settings with billions of unlabeled examples. Existing approaches search globally for the optimal examples to label, so their cost scales linearly or even quadratically with the size of the unlabeled data. In this paper, we improve the computational efficiency of active learning and search methods by restricting the candidate pool for labeling to the nearest neighbors of the currently labeled set, instead of scanning all of the unlabeled data. We evaluate several selection strategies in this setting on three large-scale computer vision datasets: ImageNet, OpenImages, and a de-identified and aggregated dataset of 10 billion publicly shared images provided by a large internet company. Our approach achieved mAP and recall similar to those of the traditional global approach while reducing the computational cost of selection by up to three orders of magnitude, enabling web-scale active learning.
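The core idea in the abstract, restricting selection to the nearest neighbors of the labeled set, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`nearest_neighbor_candidates`, `select_most_uncertain`), the brute-force NumPy distance computation, and the use of uncertainty sampling as the selection strategy are all assumptions made for illustration. A system at the scale described would presumably rely on an approximate nearest-neighbor index rather than exact distances.

```python
import numpy as np


def nearest_neighbor_candidates(embeddings, labeled_idx, k):
    """Return sorted indices of unlabeled points that are among the k nearest
    unlabeled neighbors of at least one labeled point (hypothetical helper)."""
    labeled_idx = np.asarray(labeled_idx)
    n_unlabeled = embeddings.shape[0] - len(labeled_idx)
    k = min(k, n_unlabeled)
    if k <= 0:
        return np.array([], dtype=int)
    labeled = embeddings[labeled_idx]
    # Squared Euclidean distance from every labeled point to every point.
    # Brute force is used only to keep the sketch self-contained.
    d2 = ((labeled[:, None, :] - embeddings[None, :, :]) ** 2).sum(axis=-1)
    # Already-labeled points must never be proposed for labeling again.
    d2[:, labeled_idx] = np.inf
    # Indices of the k smallest distances per labeled point (unordered).
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return np.unique(nn)


def select_most_uncertain(scores, candidates, budget):
    """Pick up to `budget` candidates whose predicted probability is closest
    to 0.5 (uncertainty sampling, assumed here as one possible strategy)."""
    uncertainty = -np.abs(scores[candidates] - 0.5)
    order = np.argsort(uncertainty)[::-1][:budget]
    return candidates[order]


# Toy data: one labeled point, two close neighbors, three distant points.
embeddings = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 10.0]]
)
scores = np.array([0.9, 0.8, 0.55, 0.5, 0.5, 0.5])  # made-up model outputs

candidates = nearest_neighbor_candidates(embeddings, labeled_idx=[0], k=2)
to_label = select_most_uncertain(scores, candidates, budget=1)
```

In this toy setup the selection step scores only the two candidates near the labeled point rather than all five unlabeled points. That is the source of the savings the abstract describes: the cost of selection depends on the size of the neighbor pool, not on the size of the full unlabeled set.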

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 28, 2022
Citations: 9

Similar Papers

Active Learning for Multivariate Time Series Classification with Positive Unlabeled Data
Guoliang He ... Xiangyang Jia
01 Nov 2015

Active semi-supervised learning for biological data classification.
Guilherme Camargo ... Pedro H Bugatti
PLOS ONE | VOL. 15
19 Aug 2020

Efficient Active Learning by Querying Discriminative and Representative Samples and Fully Exploiting Unlabeled Data.
Bin Gu ... Cheng Deng
IEEE Transactions on Neural Networks and Learning Systems | VOL. 32
26 Aug 2020

Research on Query-by-Committee Method of Active Learning and Application
Yue Zhao ... Yongcun Cao
01 Jan 2006
