Abstract

Background: In recent years, artificial intelligence (AI) and machine learning (ML) have received increasing attention in the diagnosis and treatment of atrial fibrillation (AF). However, AI approaches carry the risk of algorithmic bias from nonrepresentative training sets, which may harm populations.

Objective: To evaluate the use of descriptive data representing individual, social, and structural variables in AI research related to the diagnosis and treatment of AF.

Methods: Bibliometric analysis of all papers yielded by a PubMed search using the terms (atrial fibrillation) AND ((artificial intelligence) OR (machine learning)) AND (algorithm) AND (train*). Non-English papers were reviewed using Google Translate. Full text was reviewed for all papers.

Results: 174 papers were reviewed. Of these, 37 (21%) were excluded because they were review articles, were non-human studies, or did not describe research related to AF diagnostics/treatment. Of the remaining 137 papers, 78 (57%) described only the rhythm status of the training set; 48 (35%) described additional individual-level information (age/gender most common, comorbidities less common); 4 (3%) described social determinants (e.g. tobacco, alcohol, home address); 2 (1.4%) described structural determinants (i.e. insurance status). Structural determinants were inferable for additional papers describing single-center studies in countries with universal health care systems. Limitations include the use of a single reviewer, which raises the possibility that some data sets were misinterpreted.

Conclusion: At present, AI research in AF rarely incorporates the patient metrics needed to determine the representativeness of training sets for broad application. In light of historical examples demonstrating the risk of population harm from research generated by nonrepresentative samples, future AI research should include explicit accounting of individual, social, and structural determinant factors in training sets to mitigate algorithmic bias and avoid population harm.
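The percentages in the Results can be recomputed from the reported counts. The following is a minimal sketch for readers who want to check the arithmetic, not part of the authors' method; the variable names and the short category labels are assumptions introduced here, and only the counts come from the abstract.

```python
# Counts as reported in the abstract; labels are shorthand chosen for this sketch.
TOTAL_REVIEWED = 174
EXCLUDED = 37
INCLUDED = TOTAL_REVIEWED - EXCLUDED  # papers remaining after exclusion

reported = {
    "rhythm status only": 78,
    "individual-level data": 48,
    "social determinants": 4,
    "structural determinants": 2,
}


def pct(count, denom):
    """Return count/denom as a percentage rounded to one decimal place."""
    return round(100 * count / denom, 1)


excluded_pct = pct(EXCLUDED, TOTAL_REVIEWED)
shares = {label: pct(n, INCLUDED) for label, n in reported.items()}

# Papers not covered by the four reported categories, assuming (the abstract
# does not say) that the categories are mutually exclusive.
unaccounted = INCLUDED - sum(reported.values())

print(f"included: {INCLUDED}, excluded: {excluded_pct}%")
for label, share in shares.items():
    print(f"{label}: {share}%")
print(f"not covered by the four categories: {unaccounted}")
```

Under that mutual-exclusivity assumption, the four categories cover 132 of the 137 included papers; the abstract does not state how the remainder were classified.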
