BackgroundImage segmentation pipelines are commonly used in microscopy to identify cellular compartments like nucleus and cytoplasm, but there are few standards for comparing segmentation accuracy across pipelines. The process of selecting a segmentation assessment pipeline can seem daunting to researchers due to the number and variety of metrics available for evaluating segmentation quality.ResultsHere we present automated pipelines to obtain a comprehensive set of 69 metrics to evaluate segmented data and propose a selection methodology for models based on quantitative analysis, dimension reduction or unsupervised classification techniques and informed selection criteria.ConclusionWe show that the metrics used here can often be reduced to a small number of metrics that give a more complete understanding of segmentation accuracy, with different groups of metrics providing sensitivity to different types of segmentation error. These tools are delivered as easy to use python libraries, command line tools, Common Workflow Language Tools, and as Web Image Processing Pipeline interactive plugins to ensure a wide range of users can access and use them. We also present how our evaluation methods can be used to observe the changes in segmentations across modern machine learning/deep learning workflows and use cases.
Read full abstract