Abstract

Background
Due to the high dimensionality of Single Nucleotide Polymorphism (SNP) data, current imaging genetics studies of Dementia of Alzheimer’s Type (DAT) usually select only a limited number of candidate SNPs reported in the literature, even though the remaining genes may contain important AD‐related risk factors. In this study, we harness a robust feature selection technique that utilizes all available SNPs in the human genome to identify the relevant MRI and genetic features while overcoming feature redundancy.

Method
A total of 543 subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) with both MRI and genetic data available are included in the study. These subjects are divided into seven novel stratified groups [Table 1] based on their longitudinal clinical diagnosis. A subject is categorized as DAT+ if they have a follow‐up diagnosis of DAT, and as DAT‐ otherwise. We utilize a 2‐stage probabilistic multi‐kernel classifier. In the first stage, within each fold of a 10‐fold cross‐validation, we select discriminative features by applying Fisher's exact test to the SNP data and Welch's t‐test to the MRI data, using only subjects in the subgroups with the most certain longitudinal diagnosis (sNC in DAT‐ and sDAT in DAT+). In the second stage, using only the features selected most frequently in the first stage, we retrain on 80% of the sNC and sDAT subjects, validate on the remaining 20%, and test on subjects in the other subgroups, which have milder longitudinal diagnoses.

Result
Our study showed that although genetic features have lower predictive power than MRI features, combining both modalities can improve the prediction of future conversion to AD. For subjects in the pNC group, MRI features fail to indicate the potential risk of developing AD, while genetic data can accurately identify the risk [Figure 1].
Our feature selection method successfully identified significant AD risk factors for both modalities.

Conclusion
Our study demonstrated that effective feature selection eliminates the need for heuristic selection of AD‐related genes in machine‐learning studies of imaging genetics. Our proposed method reveals genetic risk factors already reported in the literature as well as some novel risk factors that could advance clinical diagnostics in the future.
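
As an illustration of the first‐stage selection described in the Method section, the following is a minimal sketch of per‐fold Fisher's exact testing on SNP data followed by a selection‐frequency count. It is not the authors' implementation. The binary carrier coding of each SNP, the unstratified fold assignment, the significance level `alpha`, the frequency cut‐off `min_fraction`, and all function names are assumptions made for this example. MRI features would pass through the same loop with Welch's t‐test in place of Fisher's exact test (for instance `scipy.stats.ttest_ind` with `equal_var=False`).

```python
from collections import Counter
from math import comb
import random


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def select_frequent_snps(carrier, labels, n_folds=10, alpha=0.05,
                         min_fraction=0.8, seed=0):
    """Return indices of SNPs that pass the test in most training folds.

    carrier[i][j] is 1 if subject i carries the minor allele of SNP j
    (assumed coding); labels[i] is 1 for sDAT and 0 for sNC.
    alpha and min_fraction are hypothetical thresholds.
    """
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]  # simple, unstratified split
    counts = Counter()
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        for j in range(len(carrier[0])):
            # Build the 2x2 diagnosis-by-carrier table on the training fold only.
            a = sum(1 for i in train if labels[i] == 1 and carrier[i][j] == 1)
            b = sum(1 for i in train if labels[i] == 1 and carrier[i][j] == 0)
            c = sum(1 for i in train if labels[i] == 0 and carrier[i][j] == 1)
            d = sum(1 for i in train if labels[i] == 0 and carrier[i][j] == 0)
            if fisher_exact_p(a, b, c, d) < alpha:
                counts[j] += 1
    return sorted(j for j, n in counts.items() if n >= min_fraction * n_folds)


# Toy data: SNP 0 tracks the diagnosis exactly, SNP 1 is unrelated to it.
labels = [1] * 20 + [0] * 20
carrier = [[labels[i], i % 2] for i in range(40)]
selected = select_frequent_snps(carrier, labels)
print(selected)
```

Counting how often a feature survives across folds, rather than testing once on all subjects, keeps features whose significance depends on a handful of subjects out of the second stage.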