Abstract

When training predictive models to assess new data, researchers often design strategies that reduce the number of sampling points required, saving cost and time, especially when information about those points is hard to obtain. Among these strategies, active learning is a popular and useful tool: it makes training more efficient by improving the quality of the sampling points selected. In biology, running a specific experiment to obtain a result can cost both time and money, so the field fits this situation exactly. However, when the information available for each data point is limited, active learning methods are relatively hard to apply. To alleviate this problem, this paper optimizes the structure of the dataset. A specific dataset is selected to test the performance of several traditional active learning methods, and a technique that optimizes the configuration of the data space is proposed to improve the performance of both the predictive models and the active learning methods. Experiments show that optimizing the data space lets the predictive model fit the dataset better and strengthens the effect of the active learning methods, yielding a performance improvement of 5% to 22% over a training progress of 5% to 20%, where progress denotes the percentage of the whole dataset used to train the predictive model. When combined with a traditional active learning method, the improvement rises to 9% to 32% at the same progress levels (5%, 10%, 15%, 20%).
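To make the setting concrete, the following is a minimal sketch of one traditional active learning strategy, pool-based uncertainty sampling, run until 20% of the pool has been labeled. It is an illustration under stated assumptions, not the method evaluated in the paper: the synthetic two-cluster data, the nearest-centroid model, and all function names are hypothetical, and the proposed data-space optimization is not shown because the abstract does not describe it.

```python
import random

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(X, y):
    """Nearest-centroid model: one centroid per class label."""
    return {c: centroid([x for x, lab in zip(X, y) if lab == c]) for c in set(y)}

def predict(model, x):
    return min(model, key=lambda c: sq_dist(x, model[c]))

def margin(model, x):
    """Gap between the two smallest centroid distances; small means uncertain."""
    d = sorted(sq_dist(x, c) for c in model.values())
    return d[1] - d[0]

def active_learning(pool_X, oracle, seed_idx, budget_fraction):
    """Query the most uncertain unlabeled point until the label budget is spent.

    Assumes the seed indices cover at least two classes.
    """
    labels = {i: oracle(i) for i in seed_idx}
    labeled = list(seed_idx)
    budget = round(budget_fraction * len(pool_X))
    while len(labeled) < budget:
        model = fit([pool_X[i] for i in labeled], [labels[i] for i in labeled])
        unlabeled = [i for i in range(len(pool_X)) if i not in labels]
        query = min(unlabeled, key=lambda i: margin(model, pool_X[i]))
        labels[query] = oracle(query)  # stands in for a costly experiment
        labeled.append(query)
    model = fit([pool_X[i] for i in labeled], [labels[i] for i in labeled])
    return model, labeled

# Hypothetical synthetic pool: two Gaussian clusters of 100 points each.
rng = random.Random(0)
pool_X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
pool_X += [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(100)]
pool_y = [0] * 100 + [1] * 100

# Seed with one labeled point per class, then label up to 20% of the pool.
model, labeled = active_learning(pool_X, lambda i: pool_y[i], [0, 100], 0.20)
accuracy = sum(predict(model, x) == y for x, y in zip(pool_X, pool_y)) / len(pool_X)
print(f"labeled {len(labeled)} of {len(pool_X)} points, pool accuracy {accuracy:.2f}")
```

The `budget_fraction` argument plays the role of the training progress mentioned above: running the loop at 0.05, 0.10, 0.15, and 0.20 and comparing accuracy against random selection is one way such a comparison could be set up.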
