Spectrum bias in algorithms derived by artificial intelligence: a case study in detecting aortic stenosis using electrocardiograms.

Andrew S Tseng, Paul A Friedman, Michal Shelly-Cohen, Itzhak Z Attia, Jae K Oh, Francisco Lopez-Jimenez, Peter A Noseworthy

doi:10.1093/ehjdh/ztab061

Open Access

https://doi.org/10.1093/ehjdh/ztab061

Journal: European Heart Journal - Digital Health
Publication Date: Jul 14, 2021
Citations: 9
License type: CC BY-NC 4.0

Affiliation: Mayo Clinic

Abstract

Spectrum bias can arise when a diagnostic test is derived from a study population whose disease spectrum differs from that of the target population, resulting in poor generalizability. We used a real-world artificial intelligence (AI)-derived algorithm for detecting severe aortic stenosis (AS) to experimentally assess the effect of spectrum bias on test performance. All adult patients at the Mayo Clinic between 1 January 1989 and 30 September 2019 who had a transthoracic echocardiogram within 180 days after an electrocardiogram (ECG) were identified. Two models were developed from two distinct patient cohorts: a whole-spectrum cohort comparing severe AS to any non-severe AS, and an extreme-spectrum cohort comparing severe AS to no AS at all. The performance of each model was assessed. Overall, 258 607 patients had valid ECG and echocardiogram pairs. The area under the receiver operating characteristic curve was 0.87 and 0.91 for the whole-spectrum and extreme-spectrum models, respectively. Sensitivity and specificity were 80% and 81%, respectively, for the whole-spectrum model and 84% and 84%, respectively, for the extreme-spectrum model. When the AI-ECG model derived from the extreme-spectrum cohort was applied to patients in the whole-spectrum cohort, the sensitivity, specificity, and area under the curve dropped to 83%, 73%, and 0.86, respectively. Although the algorithm performed robustly in identifying severe AS, this study shows that limiting datasets to clearly positive or negative labels leads to overestimation of test performance when an AI algorithm is tested for classifying severe AS from ECG data. Although the effect of the bias may be modest in this example, clinicians should be aware that such a bias exists in AI-derived algorithms.
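The mechanism described in the abstract can be illustrated with a small simulation. The following is a minimal sketch on synthetic data only: it does not use the study's data or the AI-ECG model, and the group means, group sizes, and threshold are hypothetical values chosen for illustration. A single classifier score is evaluated at a fixed threshold, first against an extreme-spectrum comparison (severe AS versus no AS) and then against a whole-spectrum comparison (severe AS versus everyone without severe AS, including intermediate cases).

```python
import random


def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity); scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        if label == 1:
            if score >= threshold:
                tp += 1
            else:
                fn += 1
        else:
            if score >= threshold:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical score distributions (assumptions, not taken from the study):
# intermediate disease produces scores between the two extremes.
rng = random.Random(0)
severe = [rng.gauss(2.0, 1.0) for _ in range(300)]
non_severe = [rng.gauss(1.0, 1.0) for _ in range(1000)]
no_as = [rng.gauss(0.0, 1.0) for _ in range(1000)]
THRESHOLD = 1.0  # hypothetical fixed operating point


def evaluate(negatives):
    """Score the same positives against a chosen set of negatives."""
    scores = severe + negatives
    labels = [1] * len(severe) + [0] * len(negatives)
    sens, spec = sensitivity_specificity(scores, labels, THRESHOLD)
    return sens, spec, auc(scores, labels)


# Extreme spectrum: severe AS versus no AS at all.
sens_ext, spec_ext, auc_ext = evaluate(no_as)
# Whole spectrum: severe AS versus all patients without severe AS.
sens_whole, spec_whole, auc_whole = evaluate(no_as + non_severe)

print(f"extreme: sens={sens_ext:.2f} spec={spec_ext:.2f} auc={auc_ext:.2f}")
print(f"whole:   sens={sens_whole:.2f} spec={spec_whole:.2f} auc={auc_whole:.2f}")
```

Because the positives and the threshold are identical in both evaluations, sensitivity is unchanged in this sketch, while specificity and AUC fall once intermediate cases join the negative group. That is the same direction of change the abstract reports, although the simulated magnitudes carry no clinical meaning.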
