Abstract

<h3>Objective:</h3> The GECO algorithm removes spurious correlations from datasets, targeting those too complex for human observation or statistical analysis to detect. We demonstrate the method’s efficacy on brain MRI images, leveraging generative techniques to maintain image quality while removing technical artifacts, and present GECO as a proof of concept for a more general approach to clearing complex, spurious correlations from many data types.
<h3>Background:</h3> Machine learning models trained on imaging data have been shown empirically to detect complex, invisible artifacts with high accuracy, such as which type of machine acquired a scan. Such artifacts may be invisible to the human eye and to statistical analysis, yet machine learning systems can identify them and end up focusing on irrelevant features rather than scientifically or medically useful ones. These systems then often “shortcut” past the features researchers want to detect and instead rely on unrelated, spurious correlations to make predictions.
<h3>Design/Methods:</h3> GECO is a Generative Adversarial Network designed for image-to-image translation, transforming a neuroimaging input image into a new image with user-selected spurious correlations removed.
<h3>Results:</h3> Beginning with classifiers trained to identify brain MRI images based on artifacts of interest, GECO reduced the classifiers’ ability to detect these spurious correlations from 97% to a level nearly equal to that of a classifier making purely random guesses. We also observed over 98% structural similarity between the original and de-artifacted brain images, indicating that the vast majority of non-spurious information in the original images was preserved.
<h3>Conclusions:</h3> In addition to solving the known problem of removing artifacts that hamper the analysis of brain MRI scans, the GECO algorithm opens the door to removing many other types of spurious correlations from both neuroimaging and a wide range of other data types in neurology and beyond. <b>Disclosure:</b> Mr. Bagley has received personal compensation for serving as an employee of Stanford University. Mr. Petrov has nothing to disclose. Miss Cheng has nothing to disclose. Mr. Armanasu has nothing to disclose. Dr. Fischbein has received publishing royalties from a publication relating to health care. Dr. Jiang has nothing to disclose. Dr. Iv has received personal compensation in the range of $500-$4,999 for serving as a Consultant for Octave Bioscience Inc. Dr. Iv has received personal compensation in the range of $500-$4,999 for serving as a Consultant for Hanalytics Pte. Ltd. Dr. Iv has received personal compensation in the range of $0-$499 for serving as a Consultant for NordicImagingLab AS. Dr. Iv has stock in Octave Bioscience Inc. Dr. Tranvinh has nothing to disclose. Michael Zeineh has nothing to disclose. Prof. Gevaert has received personal compensation in the range of $500-$4,999 for serving on a Scientific Advisory or Data Safety Monitoring board for AstraZeneca. Prof. Gevaert has received personal compensation in the range of $500-$4,999 for serving as an Editor, Associate Editor, or Editorial Advisory Board Member for Communications Medicine. The institution of Prof. Gevaert has received research support from Onc.AI. The institution of Prof. Gevaert has received research support from Roche Molecular Systems. The institution of Prof. Gevaert has received research support from Owkin.
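
The two measures reported in the Results (classifier performance relative to random guessing, and structural similarity between original and de-artifacted images) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the function names are hypothetical, the toy data is made up, and the similarity function is a simplified single-window (global) form of the SSIM formula rather than the windowed variant typically used in practice. It is not the authors' evaluation code.

```python
from statistics import fmean


def global_ssim(x, y, data_range=1.0):
    """Simplified single-window SSIM over two flattened images of equal length.

    Assumption: pixel values lie in [0, data_range]. Standard SSIM averages
    this quantity over local windows; here one window covers the whole image.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and the same size")
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for the contrast/structure term
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) * (a - mx) for a in x])
    vy = fmean([(b - my) * (b - my) for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and aligned")
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


def chance_accuracy(n_classes):
    """Expected accuracy of uniform random guessing on balanced classes."""
    return 1.0 / n_classes


if __name__ == "__main__":
    # Toy, made-up data: a flattened "image" and a slightly perturbed copy.
    original = [0.10, 0.40, 0.80, 0.30, 0.60, 0.90]
    cleaned = [0.11, 0.39, 0.79, 0.31, 0.61, 0.88]
    print("global SSIM:", global_ssim(original, cleaned))

    # Toy artifact-classifier outputs on cleaned images (two scanner classes).
    labels = [0, 1, 0, 1, 0, 1, 0, 1]
    preds = [0, 0, 1, 1, 0, 1, 1, 0]
    print("accuracy:", accuracy(preds, labels), "chance:", chance_accuracy(2))
```

Under these assumptions, successful artifact removal would show up as classifier accuracy on the cleaned images approaching `chance_accuracy(n_classes)` while the similarity score stays close to 1.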

