Segmentation evaluation with sparse ground truth data: Simulating true segmentations as perfect/imperfect as those generated by humans.

Jieyu Li, Drew A Torigian, Lisheng Wang, Jayaram K Udupa, Yubing Tong

doi:10.1016/j.media.2021.101980

Open Access

https://doi.org/10.1016/j.media.2021.101980

Abstract

Fully annotated data sets play important roles in medical image segmentation and evaluation. Expense and imprecision are the two main issues in generating ground truth (GT) segmentations. In this paper, in an attempt to overcome these two issues jointly, we propose a method, named SparseGT, which exploits variability among human segmenters to maximally reduce the manual workload of GT generation for evaluating actual segmentations produced by algorithms. Pseudo ground truth (p-GT) segmentations are created with only a small fraction of the workload and with human-level perfection/imperfection, and they can be used in practice as a substitute for fully manual GT in evaluating segmentation algorithms at the same precision. p-GT segmentations are generated by first selecting slices sparsely, with manual contouring conducted only on these sparse slices, and subsequently filling in segmentations on the other slices automatically. By creating p-GT with different levels of sparseness, we determine the largest workload reduction achievable for each considered object, where the variability of the generated p-GT is statistically indistinguishable from inter-segmenter differences in full manual GT segmentations for that object. Furthermore, we investigate the segmentation evaluation errors introduced by variability in manual GT by applying p-GT in the evaluation of actual segmentations by an algorithm. Experiments are conducted on ∼500 computed tomography (CT) studies involving six objects in two body regions, Head & Neck and Thorax, where optimal sparseness and corresponding evaluation errors are determined for each object and each strategy. Our results indicate that creating p-GT by the concatenated strategy of uniformly selecting sparse slices and filling segmentations via a deep-learning (DL) network shows the highest manual workload reduction, ∼80-96%, without sacrificing evaluation accuracy compared to fully manual GT. Nevertheless, other strategies also make clear contributions in different situations. A non-uniform strategy for slice selection shows its advantage for objects with irregular shape change from slice to slice. An interpolation strategy for filling segmentations can achieve ∼60-90% workload reduction in simulating human-level GT without the need for an actual training stage and shows potential for enlarging data sets for training p-GT generation networks. We conclude that not only is a workload reduction of over 90% feasible without sacrificing evaluation accuracy, but also that the suitable strategy and the optimal sparseness level achievable for creating p-GT are object- and application-specific.
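
The select-then-fill idea described in the abstract can be illustrated with a toy sketch. This is not the authors' SparseGT implementation: the function names are hypothetical, workload is assumed to be proportional to the number of manually contoured slices, nearest-slice copying is a deliberately simple stand-in for the interpolation and DL filling strategies, and the Dice coefficient is used only as an illustrative agreement measure (the abstract does not name the metric).

```python
from typing import Dict, List, Sequence, Set, Tuple

Voxel = Tuple[int, int]   # (row, col) position inside one slice
Mask = Set[Voxel]         # foreground voxels of one slice


def select_uniform_slices(num_slices: int, step: int) -> List[int]:
    """Pick every `step`-th slice index, always including the last slice."""
    if num_slices < 1 or step < 1:
        raise ValueError("num_slices and step must be >= 1")
    chosen = list(range(0, num_slices, step))
    if chosen[-1] != num_slices - 1:
        chosen.append(num_slices - 1)
    return chosen


def workload_reduction(num_slices: int, chosen: Sequence[int]) -> float:
    """Fraction of slices that need no manual contouring (assumed workload model)."""
    return 1.0 - len(chosen) / num_slices


def fill_nearest(manual: Dict[int, Mask], num_slices: int) -> List[Mask]:
    """Fill each slice with the mask of the nearest manually contoured slice.

    Stand-in for interpolation or DL filling; ties go to the lower slice index.
    """
    keys = sorted(manual)
    return [manual[min(keys, key=lambda k: abs(k - z))] for z in range(num_slices)]


def dice(a: Sequence[Mask], b: Sequence[Mask]) -> float:
    """Dice overlap between two segmentations given as per-slice voxel sets."""
    inter = sum(len(x & y) for x, y in zip(a, b))
    total = sum(len(x) for x in a) + sum(len(y) for y in b)
    return 2.0 * inter / total if total else 1.0


# Toy example: a 10-slice object whose cross-section grows by one voxel per slice.
full_gt = [{(0, c) for c in range(z + 1)} for z in range(10)]
chosen = select_uniform_slices(10, 3)            # -> [0, 3, 6, 9]
p_gt = fill_nearest({z: full_gt[z] for z in chosen}, 10)
print(chosen, workload_reduction(10, chosen), dice(p_gt, full_gt))
```

In the paper's setting, the sparseness level (here `step`) would be increased only as long as the disagreement between p-GT and full manual GT stays statistically indistinguishable from inter-segmenter differences; the sketch shows the mechanics only, not that statistical test.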


Journal: Medical Image Analysis
Publication Date: Jan 26, 2021
Citations: 6

Similar Papers

Can Ground Truth Label Propagation from Video Help Semantic Segmentation?
Siva Karthik Mustikovela ... Carsten Rother
01 Jan 2015

Anatomy segmentation evaluation with sparse ground truth data
Jieyu Li ... Yubing Tong
10 Mar 2020

Weakly-supervised object detection via mining pseudo ground truth bounding-boxes
Yongqiang Zhang ... Bernard Ghanem
Pattern Recognition | VOL. 84
05 Jul 2018

Clinical Acceptability of Automatically Generated Elective Lymph Node Volumes for Head and Neck Cancer Patients
S Maroongroge ... T Netherton
International Journal of Radiation Oncology*Biology*Physics | VOL. 117
29 Sep 2023
