Abstract

We have developed an automated system for the diagnosis of intestinal parasites from optical microscopy images. The objects segmented from these images (parasite species and impurities) form a large dataset. We address the active learning problem of selecting a reasonably small number of objects to be labeled under an expert's supervision and then used to train a pattern classifier. Impurities, however, are very numerous, form several clusters in the feature space, and can closely resemble some parasite species, which poses a significant challenge for active learning methods. We propose a technique that pre-organizes the data and then balances the selection of training samples between samples from all classes and uncertain samples. Because the data are organized once, up front, the large dataset does not need to be reprocessed at each learning iteration, and sample selection can halt as soon as the desired number of samples per iteration is reached, which yields interactive response times. We validate our method by comparing it with state-of-the-art approaches on a previously labeled dataset of almost 6000 objects. We also report results from experiments on a realistic scenario: a dataset of over 140,000 unlabeled objects with unbalanced classes, some classes absent, and a very large set of impurities.
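The idea of organizing the data once and then halting selection early can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering method or the uncertainty criterion, so the `centroids` input (assumed to come from any clustering step), the `is_uncertain` callback (a stand-in for the current classifier's judgment), and the round-robin visiting order are all hypothetical choices made for illustration.

```python
from math import dist


def organize(points, centroids):
    """One-time step: group sample indices by nearest centroid, closest first.

    `centroids` is assumed to come from some clustering method (hypothetical).
    """
    clusters = [[] for _ in centroids]
    for i, p in enumerate(points):
        distances = [dist(p, c) for c in centroids]
        k = distances.index(min(distances))
        clusters[k].append((distances[k], i))
    return [[i for _, i in sorted(members)] for members in clusters]


def select_batch(clusters, is_uncertain, labeled, per_iteration):
    """Per-iteration step: visit clusters round-robin and stop when the batch is full.

    `is_uncertain(i)` is a hypothetical callback for the current classifier.
    Only the pre-sorted lists are walked; nothing is re-clustered or re-sorted.
    """
    batch = []
    cursors = [0] * len(clusters)
    found_any = True
    while found_any and len(batch) < per_iteration:
        found_any = False
        for k, members in enumerate(clusters):
            # Advance to the next unlabeled, uncertain sample in this cluster.
            while cursors[k] < len(members):
                i = members[cursors[k]]
                cursors[k] += 1
                if i not in labeled and is_uncertain(i):
                    batch.append(i)
                    found_any = True
                    break
            if len(batch) == per_iteration:
                break  # early halt: the rest of the dataset is never touched
    return batch


points = [(0, 0), (0, 1), (5, 5), (5, 6), (0, 3)]
clusters = organize(points, centroids=[(0, 0), (5, 5)])
print(select_batch(clusters, lambda i: True, labeled={0}, per_iteration=3))
# prints [1, 2, 4]
```

Taking one sample per cluster in turn keeps every cluster represented in the batch, which is one simple way to spread labeling effort when a single group (here, impurities) dominates the data; the uncertainty filter then restricts the expert's effort to samples the current classifier is assumed to find difficult.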
