Evaluating the Impact of Retinal Vessel Segmentation Metrics on Retest Reliability in a Clinical Setting: A Comparative Analysis Using AutoMorph.

Samuel D Giesser, Ferhat Turgut, Amr Saad, Jay R Zoellin, Chiara Sommer, Yukun Zhou, Siegfried K Wagner, Pearse A Keane, Matthias Becker, Delia Cabrera Debuc, Gábor Márk Somfai

doi:10.1167/iovs.65.13.24

Abstract

Current research on artificial intelligence-based fundus photography biomarkers has demonstrated inconsistent results. Consequently, we aimed to evaluate and predict the test-retest reliability of retinal parameters extracted from fundus photography. Two groups of patients were recruited for the study: an intervisit group (n = 28) to assess retest reliability over a period of 1 to 14 days and an intravisit group (n = 44) to evaluate retest reliability within a single session. Using AutoMorph, we generated test and retest vessel segmentation maps; measured segmentation map agreement via accuracy, sensitivity, F1 score, and Jaccard index; and calculated 76 metrics from each fundus image. The retest reliability of each metric was analyzed in terms of the Spearman correlation coefficient, intraclass correlation coefficient (ICC), and relative percentage change. A linear model was developed to predict the median percentage difference on retest per image from image-quality metrics; its input variables, contrast-to-noise ratio and fractal dimension, were chosen by a P-value-based backward selection process. This model was trained on the intravisit dataset and validated using the intervisit dataset.

In the intervisit group, retest reliability varied across metrics, with Spearman correlation coefficients of 0.34 to 0.99, ICC values of 0.31 to 0.99, and mean absolute percentage differences of 0.96% to 223.67%. Similarly, in the intravisit group, Spearman correlation coefficients ranged from 0.55 to 0.96, ICC values from 0.40 to 0.97, and mean percentage differences from 0.49% to 371.23%. Segmentation map accuracy between test and retest never dropped below 97%; the mean F1 scores were 0.85 for the intravisit dataset and 0.82 for the intervisit dataset. In terms of the Spearman correlation coefficient, the best retest reliability in both datasets was achieved with disc width, whereas the worst was observed for tortuosity density in the intervisit group and artery tortuosity density in the intravisit group. The intravisit group exhibited better retest reliability than the intervisit group (P < 0.001). Our linear model, with the two independent variables contrast-to-noise ratio and fractal dimension, predicted the median retest reliability per image on its validation dataset, the intervisit group, with an R² of 0.53 (P < 0.001).

Our findings highlight considerable volatility in the reliability of some retinal biomarkers. Improving retest reliability could allow disease progression modeling in smaller datasets or an individualized treatment approach. Image quality is moderately predictive of retest reliability, and further work is warranted to better understand the reasons behind our observations and thus ensure consistent retest results.
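The segmentation agreement measures named in the abstract (accuracy, sensitivity, F1 score, Jaccard index) and a per-metric percentage difference can be illustrated with a minimal sketch. This is not AutoMorph's code or the authors' analysis pipeline. The function names `agreement_metrics` and `abs_percent_difference` are hypothetical, treating the test map as the reference for sensitivity is an assumption, and the percentage-difference formula is one common definition that the abstract does not spell out.

```python
from typing import Dict, Sequence

Mask = Sequence[Sequence[int]]  # 2-D binary mask: 1 = vessel pixel, 0 = background


def agreement_metrics(test_mask: Mask, retest_mask: Mask) -> Dict[str, float]:
    """Pixel-wise agreement of a retest mask with a test mask.

    Assumption: the test mask plays the role of the reference, so a vessel
    pixel present only in the retest mask counts as a false positive.
    """
    tp = fp = fn = tn = 0
    for test_row, retest_row in zip(test_mask, retest_mask):
        for t, r in zip(test_row, retest_row):
            if t and r:
                tp += 1  # vessel in both maps
            elif r:
                fp += 1  # vessel only in the retest map
            elif t:
                fn += 1  # vessel only in the test map
            else:
                tn += 1  # background in both maps
    total = tp + fp + fn + tn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else nan,
        "jaccard": tp / (tp + fp + fn) if (tp + fp + fn) else nan,
    }


def abs_percent_difference(test_value: float, retest_value: float) -> float:
    """Absolute change on retest as a percentage of the test value.

    Assumed definition; raises ZeroDivisionError if test_value is 0.
    """
    return 100.0 * abs(retest_value - test_value) / abs(test_value)


# Toy 3x4 masks invented for illustration only (not study data).
test_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
retest_mask = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

metrics = agreement_metrics(test_mask, retest_mask)
print(metrics)
print(abs_percent_difference(10.0, 12.0))
```

Accuracy counts background pixels that agree, while F1 and the Jaccard index ignore them. Because vessels occupy a minority of the pixels in a fundus image, accuracy can stay high even when the overlap-based scores are noticeably lower, which is worth keeping in mind when reading the accuracy and F1 figures reported above.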

Journal: Investigative Ophthalmology & Visual Science
Publication Date: Nov 14, 2024
License type: CC BY 4.0
