Abstract

There are many situations in supervised learning where data acquisition is very expensive and the amount of data is limited by a user's budget. One way to address this limitation is active learning. In this study, we focus on a fixed-budget regime and propose a novel algorithm for the pool-based active learning problem. The proposed method performs active learning with a pre-trained acquisition function so that the maximum performance is achieved when the number of samples that can be acquired is fixed. To implement this active learning algorithm, the proposed method uses reinforcement learning based on deep neural networks as a pre-trained acquisition function tailored to the fixed-budget situation. Using the pre-trained deep Q-learning-based acquisition function, we realize an active learner that selects a sample for annotation from the pool of unlabeled samples while taking the fixed-budget situation into account. The proposed method is experimentally shown to be comparable with or superior to existing active learning methods, suggesting the effectiveness of the proposed approach for fixed-budget active learning.
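
To make the selection step concrete, the following is a minimal sketch (not the authors' implementation) of how a pre-trained Q-network could score candidates in an unlabeled pool and pick the next sample to annotate. The QNet architecture, the state and candidate encodings, and the dimensions are illustrative assumptions.

```python
# Sketch only: greedy sample selection with a pre-trained Q-network.
# The network architecture and feature encodings are assumptions for illustration.
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Scores a (state, candidate-sample) pair with an estimated Q-value."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1)).squeeze(-1)

def select_for_annotation(q_net, state, pool_features):
    """Greedy action: return the pool index with the highest Q-value."""
    with torch.no_grad():
        s = state.expand(pool_features.shape[0], -1)   # broadcast state over the pool
        q_values = q_net(s, pool_features)             # one Q-value per candidate
    return int(torch.argmax(q_values).item())

# Toy usage: a 16-dim state summary and a pool of 100 candidate samples.
q_net = QNet(state_dim=16, action_dim=8)
state = torch.randn(1, 16)
pool = torch.randn(100, 8)
print("query sample", select_for_annotation(q_net, state, pool), "from the pool")
```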

Highlights

  • In the framework of supervised learning, the predictive performance of a learned model should improve as the number of samples increases

  • We compare active learning methods with an oracle data selector to demonstrate that the proposed method considers the context of data selection

  • We propose an active learning method suitable for a fixed-budget regime


Summary

Introduction

In the framework of supervised learning, the predictive performance of a learned model should improve as the number of samples increases. Suppose we want to learn a prediction model when a budget is fixed in advance, namely, the number of samples to be labeled is pre-determined, which is an extremely common situation when developing a machine learning system. In this case, a better model should be obtained if we acquire the labeled data in a manner that considers the data acquisition order, or context, within the budget. Considering the context within which the data is acquired, data is selected according to an appropriate criterion that reflects the current state of the learning model, so that the model performance is maximized when the specified number of samples has been acquired. For this purpose, a deep Q-network (DQN) [32] is used to learn an acquisition function that takes the data context into account. The last section is devoted to the discussion and conclusion.
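
As a rough illustration of this fixed-budget setting (a sketch under our own assumptions, not the paper's algorithm), the loop below queries one sample at a time until the budget is exhausted and retrains after every annotation, so that each acquisition decision reflects the current state of the model. Here `acquire` and `fit` are hypothetical placeholders for the acquisition function (e.g. the learned Q-function) and the task model.

```python
# Sketch only: a fixed-budget pool-based active learning loop.
# `acquire` and `fit` are placeholder callables, not part of the paper.
from typing import Callable, List, Set

def active_learning_loop(
    pool_indices: List[int],
    budget: int,
    acquire: Callable[[Set[int], Set[int]], int],  # (labeled, unlabeled) -> chosen index
    fit: Callable[[Set[int]], None],               # retrain the model on the labeled set
) -> Set[int]:
    """Query `budget` samples one at a time, retraining after each annotation."""
    labeled: Set[int] = set()
    unlabeled: Set[int] = set(pool_indices)
    for _ in range(budget):
        idx = acquire(labeled, unlabeled)  # pick the next sample given the current context
        unlabeled.remove(idx)
        labeled.add(idx)                   # annotate (an oracle provides the label)
        fit(labeled)                       # update the model before the next query
    return labeled
```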

Related Work
Preliminary for Reinforcement Learning
Q-Learning
Deep Q-Network
Proposed Method
Design of the State
Design of the Action
Design of Reward
Advantage of the Proposed Method
Experiments
Dependence on the pre-training dataset
Experiments with Real-World Datasets
Method
Computational Costs
Conclusion and Future Work