Abstract

The most commonly used active learning criterion is uncertainty sampling, in which a supervised model identifies the samples it is most uncertain about so that a human annotator can label them next. Two problems arise when using active learning: first, the supervised model suffers from a cold-start problem at the beginning of the active learning process, and second, the human annotators may make labeling mistakes. In my Ph.D. research, I address both problems by developing an unsupervised method for computing informative samples. These informative samples are manually labeled first and then used for two purposes: training the initial active learning model and training the human annotators in the form of a learning-by-doing session. The planned unsupervised method will be based on word embeddings and will be limited to the area of text classification.
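To make the uncertainty sampling criterion mentioned above concrete, the following is a minimal sketch of a least-confidence query step, assuming a scikit-learn classifier and a pool of unlabeled feature vectors; the function name, variable names, and toy data are illustrative assumptions, not part of the thesis itself.

```python
# Hedged sketch: least-confidence uncertainty sampling over an unlabeled pool.
import numpy as np
from sklearn.linear_model import LogisticRegression

def least_confidence_query(model, X_pool, n_queries=5):
    """Return indices of the pool samples the model is least confident about."""
    probs = model.predict_proba(X_pool)        # class probabilities per sample
    confidence = probs.max(axis=1)             # probability of the predicted class
    return np.argsort(confidence)[:n_queries]  # lowest confidence = most uncertain

# Toy usage: a small labeled seed set and a random unlabeled pool (hypothetical data).
rng = np.random.default_rng(0)
X_seed = rng.normal(size=(20, 10))
y_seed = rng.integers(0, 2, size=20)
X_pool = rng.normal(size=(100, 10))

model = LogisticRegression().fit(X_seed, y_seed)
query_indices = least_confidence_query(model, X_pool)
print("Samples to send to the annotator:", query_indices)
```

The cold-start problem shows up in the seed step of this loop: before any labels exist, no such model is available to score uncertainty, which is what the proposed unsupervised, word-embedding-based selection of informative samples is meant to address.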
