Abstract

Clustering-based active learning approaches exploit the structure of the data to select representative instances. However, some existing algorithms are either inefficient or applicable only to certain kinds of data. In this paper, we propose an effective and adaptive algorithm called active learning through two-stage clustering (ALTA). The first stage is data preprocessing, which uses the two-round-clustering algorithm to obtain $\sqrt{n}$ small blocks, where $n$ is the number of instances. For each block, the instance closest to the center is selected as the representative. The second stage is active learning on the representative instances through density clustering. This stage consists of a number of iterations of density clustering, labeling, and classification. Data preprocessing reduces the size of the data and thus the complexity of the algorithm, while the combination of distance vector clustering and density clustering makes the algorithm more adaptive. Experiments compare ALTA against state-of-the-art active learning algorithms on nine datasets. The results show that the new algorithm achieves higher classification accuracy for the same number of labeled instances.
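
The representative-selection step of the first stage can be illustrated with a minimal sketch. It is not the paper's two-round-clustering algorithm, whose details the abstract does not give: plain k-means is swapped in as a stand-in, with the number of blocks set to about $\sqrt{n}$. The function names `dist2`, `kmeans`, and `select_representatives`, the iteration count, and the seed are all hypothetical choices made for illustration.

```python
import math
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumed stand-in for two-round clustering)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign every point to its nearest center.
        assign = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    # Final assignment against the updated centers.
    assign = [min(range(k), key=lambda c: dist2(p, centers[c]))
              for p in points]
    return centers, assign


def select_representatives(points, seed=0):
    """Return one index per non-empty block: the instance closest to its center."""
    n = len(points)
    k = max(1, round(math.sqrt(n)))  # about sqrt(n) blocks
    centers, assign = kmeans(points, k, seed=seed)
    reps = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:  # empty blocks contribute no representative
            reps.append(min(members,
                            key=lambda i: dist2(points[i], centers[c])))
    return reps


if __name__ == "__main__":
    rng = random.Random(42)
    data = [(rng.random(), rng.random()) for _ in range(100)]
    reps = select_representatives(data)
    print(len(reps), "representatives chosen from", len(data), "instances")
```

Under these assumptions, the second stage would then run only on the returned representatives, which is where the reduction in data size comes from.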
