Abstract
To identify neuroimaging biomarkers of alcohol dependence (AD) from structural magnetic resonance imaging, it may be useful to develop classification models that are explicitly generalizable to unseen sites and populations. This problem was explored in a mega‐analysis of previously published datasets from 2,034 AD and comparison participants spanning 27 sites curated by the ENIGMA Addiction Working Group. Data were grouped into a training set used for internal validation, comprising 1,652 participants (692 AD, 24 sites), and a test set used for external validation, comprising 382 participants (146 AD, 3 sites). An exploratory data analysis was conducted first, followed by an evolutionary search‐based feature selection to identify site‐generalizable and high‐performing subsets of brain measurements. The exploratory data analysis revealed that including case‐only and control‐only sites led to the inadvertent learning of site effects, and that cross‐validation methods that do not properly account for site can drastically overestimate performance. Evolutionary feature selection using leave‐one‐site‐out cross‐validation, chosen to combat this unintentional learning of site effects, identified cortical thickness in the left superior frontal gyrus and right lateral orbitofrontal cortex, cortical surface area in the right transverse temporal gyrus, and left putamen volume as the final features. Ridge regression restricted to these features yielded a test‐set area under the receiver operating characteristic curve of 0.768. This work evaluates strategies for handling multi‐site data with varied underlying class distributions and identifies potential biomarkers for individuals with current AD.
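The contrast between site-blind and leave-one-site-out cross-validation described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' pipeline: the data are synthetic (six hypothetical sites with different case rates, and a site offset on one feature that carries no diagnostic signal within any site), the function names are invented, and a closed-form ridge regression with a rank-based AUC stands in for the actual models and tooling.

```python
import numpy as np

def auc(y_true, scores):
    """ROC AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def random_kfold_auc(X, y, n_folds=5, alpha=1.0, seed=0):
    """Site-blind K-fold CV: participants from every site appear in train and test."""
    rng = np.random.default_rng(seed)
    aucs = []
    for test_idx in np.array_split(rng.permutation(len(y)), n_folds):
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        w, b = ridge_fit(X[train], y[train], alpha)
        aucs.append(auc(y[test_idx], X[test_idx] @ w + b))
    return float(np.mean(aucs))

def leave_one_site_out_auc(X, y, site, alpha=1.0):
    """Hold out each site in turn; AUC is only defined for sites with both classes."""
    results = {}
    for s in np.unique(site):
        test = site == s
        if len(np.unique(y[test])) < 2:
            continue  # case-only or control-only site: skip scoring
        w, b = ridge_fit(X[~test], y[~test], alpha)
        results[int(s)] = auc(y[test], X[test] @ w + b)
    return results

def make_confounded_data(seed=0, n_per_site=200, n_features=5):
    """Synthetic data (assumption): features are pure noise, but feature 0 carries
    a site offset that tracks each site's case rate, so site predicts diagnosis."""
    rng = np.random.default_rng(seed)
    case_rates = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
    X_parts, y_parts, site_parts = [], [], []
    for s, rate in enumerate(case_rates):
        X_s = rng.normal(size=(n_per_site, n_features))
        X_s[:, 0] += 4.0 * (rate - 0.5)  # site effect, unrelated to diagnosis within a site
        X_parts.append(X_s)
        y_parts.append((rng.random(n_per_site) < rate).astype(int))
        site_parts.append(np.full(n_per_site, s))
    return np.vstack(X_parts), np.concatenate(y_parts), np.concatenate(site_parts)

if __name__ == "__main__":
    X, y, site = make_confounded_data()
    per_site = leave_one_site_out_auc(X, y, site)
    print("site-blind 5-fold AUC:", round(random_kfold_auc(X, y), 3))
    print("leave-one-site-out mean AUC:", round(float(np.mean(list(per_site.values()))), 3))
```

In this toy setup, the site-blind estimate should sit well above chance even though no feature separates cases from controls within a site, while the leave-one-site-out estimate should stay near 0.5. In practice, scikit-learn's `LeaveOneGroupOut` splitter provides the same hold-out-by-site scheme.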
Highlights
While the evidence associating alcohol dependence (AD) with structural brain differences is strong (Ewing, Sakhardande, & Blakemore, 2014; Fein et al., 2002; Yang et al., 2016), there is considerable merit in establishing robust and generalizable neuroimaging‐based AD biomarkers (Mackey et al., 2019; Yip, Kiluk, & Scheinost, 2020).
Prior machine learning classifiers for AD include a similar binary classification approach discriminating between AD and substance‐naive controls (Guggenmos et al., 2018).
Mackey et al. (2019) developed a support vector machine (SVM) classifier that obtained an average area under the receiver operating characteristic curve (AUC) of 0.76 on a subset of the training data presented within this work.
Summary
While the evidence associating alcohol dependence (AD) with structural brain differences is strong (Ewing, Sakhardande, & Blakemore, 2014; Fein et al., 2002; Yang et al., 2016), there is considerable merit in establishing robust and generalizable neuroimaging‐based AD biomarkers (Mackey et al., 2019; Yip, Kiluk, & Scheinost, 2020). These biomarkers would have objective utility for diagnosis and may help in identifying youth at risk for AD and in tracking recovery and treatment efficacy in abstinence, including relapse potential. A further example of recent work is that of Adeli et al., who distinguished AD from controls (among other phenotypes) in a larger sample of 421, obtaining a balanced accuracy of 70.1% across 10‐fold cross‐validation (Adeli, 2019). In both examples, volumetric brain measures were extracted and used to train and evaluate the proposed machine learning (ML) algorithms. A more detailed breakdown of the dataset by study and collection site is provided in the supplemental materials.