Abstract

Convolutional neural networks (CNNs) are powerful tools for biomedical image classification. It is commonly assumed that training CNNs requires large amounts of annotated data. This is a bottleneck in many medical applications where annotation relies on expert knowledge. Here, we analyze the binary classification performance of a CNN on two independent cytomorphology datasets as a function of training set size. Specifically, we train a sequential model to discriminate non-malignant leukocytes from blast cells, whose appearance in the peripheral blood is a hallmark of leukemia. We systematically vary the training set size and find that tens of training images suffice for a binary classification with an ROC-AUC over 90%. Saliency maps and layer-wise relevance propagation visualizations suggest that the network learns to focus increasingly on nuclear structures of leukocytes as the number of training images is increased. A low-dimensional t-SNE representation reveals that, while the two classes are already separated with only a few training images, the distinction between the classes becomes clearer when more training images are used. To evaluate the performance in a multi-class problem, we annotated single-cell images from an acute lymphoblastic leukemia dataset into six different hematopoietic classes. Multi-class prediction suggests that here, too, a few single-cell images suffice if the differences between morphological classes are large enough. The incorporation of deep learning algorithms into clinical practice has the potential to reduce variability and cost, democratize usage of expertise, and allow for early detection of disease onset and relapse. Our approach evaluates the performance of a deep-learning-based cytology classifier with respect to the size and complexity of the training data and the classification task.

Highlights

  • Annotation is often expensive and time consuming, making generation of large high-quality datasets difficult[11,12]

  • Much previous work has focussed on the overall size of the datasets[13], whereas we are interested in exploring the effect of training set sizes on the overall performance

  • To evaluate the robustness and generalizability of our results, an analogous analysis is performed with a much larger publicly available acute myeloid leukemia (AML) dataset containing more than 18,000 single-cell images, with very similar results

Summary

Introduction

Annotation is often expensive and time consuming, making the generation of large high-quality datasets difficult[11,12]. Systematic variation of training set size has previously been analyzed for CT image classification[14]. First, a small publicly available dataset containing 250 single-cell images of leukocytes from acute lymphoblastic leukemia (ALL) patients and healthy individuals is used for CNN-based cell type classification. Both a binary and a multiclass classifier are trained on this dataset, and the performance of the binary classifier is evaluated with respect to the number of training images used. While systematically increasing the size of the training data, we investigate the performance of our CNN and analyze the focus of the network as a function of the number of training images. To evaluate the robustness and generalizability of our results, an analogous analysis is performed with a much larger publicly available acute myeloid leukemia (AML) dataset containing more than 18,000 single-cell images, with very similar results.
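
The protocol described above (varying the number of training images and tracking binary ROC-AUC on a held-out set) can be sketched in a few lines. This is only an illustrative sketch and not the authors' pipeline: a one-dimensional nearest-centroid scorer on synthetic Gaussian features stands in for the CNN and the single-cell images, and every name used here (`roc_auc`, `stratified_subset`, `fit_centroid_scorer`, `learning_curve`, `make_toy_data`) is hypothetical.

```python
import random


def roc_auc(labels, scores):
    """Rank-based ROC-AUC: chance that a random positive outscores a
    random negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_subset(samples, n_per_class, rng):
    """Draw n_per_class (feature, label) pairs from each class without replacement."""
    by_class = {}
    for x, y in samples:
        by_class.setdefault(y, []).append((x, y))
    subset = []
    for y in sorted(by_class):
        subset.extend(rng.sample(by_class[y], n_per_class))
    return subset


def fit_centroid_scorer(train):
    """Toy stand-in for the CNN (assumption): a higher score means the
    sample lies closer to the class-1 centroid than to the class-0 centroid."""
    mean = {}
    for c in (0, 1):
        xs = [x for x, y in train if y == c]
        mean[c] = sum(xs) / len(xs)
    return lambda x: abs(x - mean[0]) - abs(x - mean[1])


def learning_curve(train_pool, test_set, sizes, repeats=5, seed=0):
    """Mean test ROC-AUC for each training set size (images per class)."""
    rng = random.Random(seed)
    test_labels = [y for _, y in test_set]
    curve = {}
    for n in sizes:
        aucs = []
        for _ in range(repeats):
            scorer = fit_centroid_scorer(stratified_subset(train_pool, n, rng))
            aucs.append(roc_auc(test_labels, [scorer(x) for x, _ in test_set]))
        curve[n] = sum(aucs) / len(aucs)
    return curve


def make_toy_data(n_per_class, rng):
    """Synthetic 1-D features (assumption): class 0 ~ N(0, 1), class 1 ~ N(2, 1)."""
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(2.0, 1.0), 1) for _ in range(n_per_class)]
    return data


if __name__ == "__main__":
    rng = random.Random(42)
    pool, test = make_toy_data(100, rng), make_toy_data(50, rng)
    for n, auc in learning_curve(pool, test, sizes=[1, 5, 20, 50]).items():
        print(f"{n:3d} training samples per class: mean ROC-AUC = {auc:.3f}")
```

Repeating each training set size with several random subsets, as `repeats` does here, averages out the dependence on which few samples happen to be drawn, which matters most at the smallest sizes.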
