The emergence of computational pathology comes with a demand to extract more and more information from each tissue sample. Such information extraction often requires the segmentation of numerous histological objects (e.g., cell nuclei, glands, etc.) in histological slide images, a task for which deep learning algorithms have demonstrated their effectiveness. However, these algorithms require many training examples to be efficient and robust. For this purpose, pathologists must manually segment hundreds or even thousands of objects in histological images, i.e., a long, tedious and potentially biased task. The present paper aims to review strategies that could help provide the very large number of annotated images needed to automate the segmentation of histological images using deep learning. This review identifies and describes four different approaches: the use of immunohistochemical markers as labels, realistic data augmentation, Generative Adversarial Networks (GAN), and transfer learning. In addition, we describe alternative learning strategies that can use imperfect annotations. Adding real data with high-quality annotations to the training set is a safe way to improve the performance of a well configured deep neural network. However, the present review provides new perspectives through the use of artificially generated data and/or imperfect annotations, in addition to transfer learning opportunities.
Read full abstract