Abstract

<h3>Purpose/Objective(s)</h3>

Contouring Collaborative for Consensus in Radiation Oncology (C3RO) is a public crowdsourced challenge engaging radiation oncologists across various expertise levels in cloud-based image segmentation. A persistent challenge in artificial intelligence (AI) development is the relative paucity of multi-expert observer datasets large enough to train deep learning models; consequently, we sought to characterize whether aggregate segmentations generated from large numbers of generalists could meet or exceed expert interobserver agreement, the current "gold standard."

<h3>Materials/Methods</h3>

Participants who contoured at least one region of interest (ROI) for the C3RO breast or sarcoma challenge were identified as generalists or recognized experts. Cohort-specific ROIs were combined into single simultaneous truth and performance level estimation (STAPLE) consensus segmentations. STAPLE<sub>generalist</sub> ROIs were evaluated against STAPLE<sub>expert</sub> contours using the Dice similarity coefficient (DSC). The expert interobserver DSC (IODSC<sub>expert</sub>), defined as the median pairwise DSC across experts, served as the performance acceptability threshold for comparing STAPLE<sub>generalist</sub> with STAPLE<sub>expert</sub>. To determine the number of generalists required to match the IODSC<sub>expert</sub> for each ROI, a single STAPLE<sub>bootstrap</sub> consensus contour was generated for each of 10 random bootstrap folds using a variable number of generalists (2 to 25) and then compared to the IODSC<sub>expert</sub>.

<h3>Results</h3>

The breast challenge yielded contours from 124 generalists and 8 experts. The DSC between STAPLE<sub>generalist</sub> and STAPLE<sub>expert</sub> was higher than the respective IODSC<sub>expert</sub> for all ROIs, including the axilla (STAPLE<sub>generalist</sub> DSC/IODSC<sub>expert</sub>, 0.86/0.68), chest wall (0.91/0.67), heart (0.97/0.90), supraclavicular nodes (0.77/0.57), internal mammary nodes (0.66/0.46), left brachial plexus (0.46/0.20), and left anterior descending artery (0.62/0.32). The sarcoma challenge yielded contours from 61 generalists and 4 experts. The DSC between STAPLE<sub>generalist</sub> and STAPLE<sub>expert</sub> was higher than the respective IODSC<sub>expert</sub> for the gross tumor volume (GTV; 0.97/0.94) and clinical target volume (CTV; 0.76/0.69), but not for the genitalia (0.60/0.66). The theoretical minimum number of generalist segmentations needed to cross the IODSC<sub>expert</sub> acceptability threshold ranged from 2 to 4 for breast ROIs and from 2 to 5 for sarcoma ROIs.

<h3>Conclusion</h3>

Multi-generalist-generated consensus ROIs met or exceeded expert-derived acceptability thresholds. These analyses suggest that five or more generalists could generate consensus ROIs with DSC performance approximating that of an individual expert, supporting multi-generalist segmentation as a feasible input for AI development. Future research will explore whether these observations are site-specific and/or generalizable to more granular surface metrics.
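For readers who want to reproduce this style of analysis, the sketch below illustrates the core computations described in Materials/Methods: the DSC, the expert interobserver threshold (median pairwise DSC), and the bootstrap search for the minimum number of generalists. This is a minimal illustration, not the study's code. In particular, majority voting is used here as a simplified stand-in for the EM-based STAPLE algorithm (a dedicated STAPLE implementation, such as the one in ITK/SimpleITK, would be used in practice), and all function names and the binary-mask representation are assumptions.

```python
from itertools import combinations
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient: 2*|A & B| / (|A| + |B|) for binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return float(2.0 * np.logical_and(a, b).sum() / denom) if denom else 1.0

def expert_io_dsc(expert_masks):
    """Expert interobserver DSC: median of DSCs over all pairs of experts."""
    return float(np.median([dice(a, b) for a, b in combinations(expert_masks, 2)]))

def consensus(masks):
    """Majority-vote consensus mask -- a simplified stand-in for STAPLE,
    which instead weights each rater by an EM estimate of their performance."""
    return np.mean([m.astype(float) for m in masks], axis=0) >= 0.5

def min_generalists(generalist_masks, expert_consensus, threshold,
                    n_folds=10, max_n=25, seed=0):
    """Smallest number of generalists whose bootstrapped consensus meets the
    IODSC_expert threshold (10 random subsets per candidate size, as in the abstract)."""
    rng = np.random.default_rng(seed)
    for n in range(2, max_n + 1):
        fold_scores = []
        for _ in range(n_folds):
            # Draw n distinct generalists, build their consensus, score it
            # against the expert consensus for this ROI.
            idx = rng.choice(len(generalist_masks), size=n, replace=False)
            subset = [generalist_masks[i] for i in idx]
            fold_scores.append(dice(consensus(subset), expert_consensus))
        if np.median(fold_scores) >= threshold:
            return n
    return None  # threshold never reached within max_n generalists
```

With `generalist_masks` and `expert_masks` as lists of same-shape binary arrays for one ROI, `min_generalists(generalist_masks, consensus(expert_masks), expert_io_dsc(expert_masks))` returns the minimum generalist count for that ROI, analogous to the 2-5 range reported in the Results.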
