Abstract

We propose HookNet, a semantic segmentation model for histopathology whole-slide images, which combines context and details via multiple branches of encoder-decoder convolutional neural networks. Concentric patches at multiple resolutions with different fields of view feed the different branches of HookNet, and intermediate representations are combined via a hooking mechanism. We describe a framework to design and train HookNet for achieving high-resolution semantic segmentation and introduce constraints to guarantee pixel-wise alignment in feature maps during hooking. We show the advantages of using HookNet in two histopathology image segmentation tasks where tissue type prediction accuracy strongly depends on contextual information, namely (1) multi-class tissue segmentation in breast cancer and (2) segmentation of tertiary lymphoid structures and germinal centers in lung cancer. We show the superiority of HookNet when compared with single-resolution U-Net models working at different resolutions, as well as with a recently published multi-resolution model for histopathology image segmentation. We have made HookNet publicly available by releasing the source code [1] as well as in the form of web-based applications [2,3] based on the grand-challenge.org platform.
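As a rough illustration of the hooking mechanism and the pixel-wise alignment constraint described above (a minimal sketch under our own assumptions, not the authors' released implementation; the function name `hook` and the choice of PyTorch are ours), combining the branches can be viewed as center-cropping the context feature map so that it aligns pixel-wise with the target feature map at the same effective resolution, then concatenating along the channel axis:

```python
import torch

def hook(target_feats: torch.Tensor, context_feats: torch.Tensor) -> torch.Tensor:
    """Combine a context feature map into the target branch.

    Both tensors are NCHW and assumed to be at the same effective
    resolution; the (spatially larger) context map is center-cropped
    so it is pixel-wise aligned with the target map, and the two are
    concatenated along the channel axis.
    """
    th, tw = target_feats.shape[2:]
    ch, cw = context_feats.shape[2:]
    # An exact center crop only exists when the size difference is
    # even; this is one way the alignment constraint can be enforced.
    assert (ch - th) % 2 == 0 and (cw - tw) % 2 == 0, "misaligned feature maps"
    dy, dx = (ch - th) // 2, (cw - tw) // 2
    cropped = context_feats[:, :, dy:dy + th, dx:dx + tw]
    return torch.cat([target_feats, cropped], dim=1)
```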

Highlights

  • Semantic image segmentation is the separation of concepts by grouping pixels belonging to the same concept, with the aim of simplifying image representation and understanding

  • This paper introduces HookNet, a multi-branch segmentation framework based on convolutional neural networks that can simultaneously incorporate contextual information and high-resolution details to produce fine-grained segmentation maps of histopathology images

  • Based on the improved F1 scores and the reduced confusion for tertiary lymphoid structures (TLS) and germinal centers (GC), we argue that HookNet can reduce the confusion between classes whose classification depends on contextual information



Methods

To assess HookNet, we compared it with five individual U-Net models trained on patches extracted at the following resolutions: 0.5, 1.0, 2.0, 4.0, and 8.0 μm/px. The models are denoted U-Net(rt) and HookNet(rt, rc), where rt and rc are the input resolutions for the target and context branch, respectively. Since the aim of HookNet is to output high-resolution segmentation maps, the target branch processes input patches extracted at 0.5 μm/px. For the context branch, we extracted patches at the intermediate (2.0 μm/px) and most extreme (8.0 μm/px) of the resolutions tested for the single-resolution models; of these, only the intermediate resolution of 2.0 μm/px showed potential value in the single-resolution performance measures (see Table 3). In the HookNet models, 'hooking' from the context branch into the target branch took place at the relative depths where the feature maps of both branches have the same resolution, which depends on the input resolutions. With a target resolution of 0.5 μm/px, we applied 'hooking' into depth 4 of the target decoder (its start, i.e., the bottleneck) from depth 2 (the middle) of the context encoder for the context resolution of 2.0 μm/px, and from depth 0 (the end) of the context decoder for the context resolution of 8.0 μm/px.
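To make the depth arithmetic concrete, the sketch below recovers the hooking depths quoted above. It is our illustration, not code from the HookNet repository: `hooking_depth` is a hypothetical helper, and it assumes that each encoder level halves the spatial resolution, so a feature map at depth d has an effective resolution of the input resolution times 2^d:

```python
import math

def hooking_depth(target_res_um, context_res_um, target_depth):
    """Depth in the context branch whose feature maps match the
    effective resolution of the target branch at `target_depth`.

    Resolutions are in um/px; each encoder level is assumed to halve
    the spatial resolution, so a feature map at depth d has an
    effective resolution of input_resolution * 2**d.
    """
    target_map_res = target_res_um * 2 ** target_depth
    depth = math.log2(target_map_res / context_res_um)
    if depth < 0 or depth != int(depth):
        raise ValueError("resolutions must differ by a power of two")
    return int(depth)

# Target branch at 0.5 um/px, hooking into its decoder start (depth 4):
print(hooking_depth(0.5, 2.0, 4))  # 2 -> middle of the context encoder
print(hooking_depth(0.5, 8.0, 4))  # 0 -> end of the context decoder,
                                   #      where features are back at the
                                   #      input resolution of 8.0 um/px
```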

