Abstract

Deep neural networks typically require large amounts of annotated data to be trained effectively. However, in several scientific disciplines, including medical image analysis, generating such large annotated datasets requires specialized domain knowledge and is therefore usually very expensive. In this work, we present a novel application of active learning to data sample selection for training Convolutional Neural Networks (CNNs) for Cancerous Tissue Recognition (CTR). Our main idea is to steer annotation efforts towards the samples that are most informative for training the CNN. To quantify informativeness, we explore three measures based on discrete entropy, best-vs-second-best, and k-nearest neighbor agreement. Our results on three different types of cancer datasets consistently demonstrate that, when annotated samples are limited, our proposed training scheme converges faster than classical randomized stochastic gradient descent, while achieving the same (or sometimes superior) classification accuracy.
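
To make the three informativeness measures concrete, the following is a minimal sketch of how such scores could be computed for a pool of unlabeled samples. It is not the authors' implementation: the function names, the use of NumPy, the choice of Euclidean distance, the default `k`, and the exact scoring formulas are all assumptions made for illustration. In each case a higher score is taken to mean a more informative sample.

```python
import numpy as np


def entropy_score(probs):
    """Discrete entropy of each row of predicted class probabilities."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)


def bvsb_score(probs):
    """Best-vs-second-best: a small margin between the two most likely
    classes means high uncertainty, so we return 1 - margin (assumed form)."""
    ordered = np.sort(probs, axis=1)
    return 1.0 - (ordered[:, -1] - ordered[:, -2])


def knn_disagreement_score(features, labeled_features, labeled_labels, k=3):
    """Fraction of the k nearest labeled neighbours that do not carry the
    majority label among those neighbours (assumed form of 'agreement').
    Labels are assumed to be non-negative integers."""
    diffs = features[:, None, :] - labeled_features[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)  # Euclidean distance (assumption)
    nearest = np.argsort(dists, axis=1, kind="stable")[:, :k]
    scores = np.empty(len(features))
    for i, neighbour_labels in enumerate(labeled_labels[nearest]):
        counts = np.bincount(neighbour_labels)
        scores[i] = 1.0 - counts.max() / k
    return scores


def select_most_informative(scores, budget):
    """Indices of the `budget` samples with the highest scores."""
    return np.argsort(-scores, kind="stable")[:budget]


# Hypothetical pool of three samples with two-class predictions.
probs = np.array([[0.5, 0.5], [0.9, 0.1], [0.99, 0.01]])
to_annotate = select_most_informative(entropy_score(probs), budget=1)
```

In this toy pool, the sample whose prediction is split evenly between the two classes ranks highest under both the entropy and best-vs-second-best scores, so it would be the first one sent for annotation.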
