Abstract

The out-of-domain (OO-Do) problem has hindered machine learning models especially when the models are deployed in the real world. The OO-Do problem occurs during machine learning testing phase when a learned machine learning model must predict on data belonging to a class that is different from that of the data used for training. We tackle the OO-Do problem in an object-detection task: a parasite-egg detection model used in real-world situations. First, we introduce the In-the-wild parasite-egg dataset to evaluate the OO-Do-aware model. The dataset contains 1,552 images, 1,049 parasite-egg, and 503 OO-Do images, uploaded through chatbot. It was constructed by conducting a chatbot test session with 222 medical technology students. Thereafter, we propose a data-driven framework to construct a parasite-egg recognition model for in-the-wild applications to address the OO-Do issue. In the framework, we use publicly available datasets to train the parasite-egg recognition models about in-domain and out-of-domain concepts. Finally, we compare the integration strategies for our proposed two-step parasite-egg detection approach on two test sets: standard and In-the-wild datasets. We also investigate different thresholding strategies for model robustness to OO-Do data. Experiments on two test datasets showed that concatenating an OO-Do-aware classification model after an object-detection model achieved outstanding performance in detecting parasite eggs. The framework gained 7.37% and 4.09% F1-score improvement from the baselines on Chulatest+WildOO−Do dataset and the In-the-wild parasite-egg dataset, respectively.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.