Abstract

An active learning algorithm is devised for training Self-Organizing Feature Maps (SOMs) on large data sets. Active learning algorithms recognize that not all exemplars are created equal. Accordingly, the concepts of exemplar age and difficulty are used to filter the original data set so that training epochs are conducted over only a small subset of it. The resulting Hierarchical Dynamic Subset Selection algorithm introduces definitions of exemplar difficulty suited to an unsupervised learning context and, in turn, appropriate SOM stopping criteria. The algorithm is benchmarked on several real-world data sets with training sets of roughly 30,000 to 500,000 exemplars. Cluster accuracy is shown to be at least as good as that of the original SOM algorithm at a fraction of the computational cost.
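
The idea of filtering by exemplar age and difficulty can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give its definitions, so the sketch assumes that difficulty is the squared distance from an exemplar to its best-matching SOM unit, that age is the number of selection rounds since an exemplar was last chosen, and that each pick is biased by difficulty with a fixed probability and by age otherwise. The names `quantization_error`, `select_subset`, and `p_difficulty` are hypothetical, and the hierarchical (block-level) stage is omitted.

```python
import random


def quantization_error(weights, x):
    """Squared distance from x to its nearest SOM unit (assumed difficulty measure)."""
    return min(
        sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x))
        for w in weights
    )


def select_subset(data, weights, ages, subset_size, p_difficulty=0.7, rng=random):
    """Sample exemplar indices (with replacement), biased by difficulty or age.

    ages is updated in place: chosen exemplars are reset to 0,
    all others grow one round older.
    """
    difficulty = [quantization_error(weights, x) for x in data]
    indices = range(len(data))
    chosen = []
    for _ in range(subset_size):
        # Assumption: each pick uses difficulty with probability p_difficulty,
        # otherwise age, so rarely seen exemplars still get revisited.
        scores = difficulty if rng.random() < p_difficulty else ages
        # A small epsilon keeps the total sampling weight strictly positive.
        sample_weights = [s + 1e-12 for s in scores]
        chosen.append(rng.choices(indices, weights=sample_weights, k=1)[0])
    chosen_set = set(chosen)
    for i in indices:
        ages[i] = 0 if i in chosen_set else ages[i] + 1
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.random(), rng.random()] for _ in range(1000)]
    som_units = [[0.25, 0.25], [0.75, 0.75]]  # toy two-unit map
    ages = [0] * len(data)
    subset = select_subset(data, som_units, ages, subset_size=50, rng=rng)
    print(len(subset), "exemplars selected for the next training epochs")
```

Under these assumptions, SOM training epochs would run only over the selected subset, with a new subset drawn periodically so that poorly represented or long-unseen exemplars are revisited.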
