Exploring the impact of analysis software on task fMRI results.

Alexander Bowring, Thomas E Nichols, Camille Maumet

doi:10.1002/hbm.24603

Abstract

A wealth of analysis tools is available to fMRI researchers in order to extract patterns of task variation and, ultimately, understand cognitive function. However, this “methodological plurality” comes with a drawback. While conceptually similar, two different analysis pipelines applied to the same dataset may not produce the same scientific results. Differences in methods, implementations across software, and even operating systems or software versions all contribute to this variability. Consequently, attention in the field has recently been directed to reproducibility and data sharing. In this work, our goal is to understand how the choice of software package impacts analysis results. We use publicly shared data from three published task fMRI neuroimaging studies, reanalyzing each study using the three main neuroimaging software packages, AFNI, FSL, and SPM, with both parametric and nonparametric inference. We obtain all information on how to process, analyse, and model each dataset from the publications. We make quantitative and qualitative comparisons between our replications to gauge the scale of variability in our results and assess the fundamental differences between each software package. Qualitatively, we find similarities between packages, backed up by Neurosynth association analyses that correlate similar words and phrases to all three software packages' unthresholded results for each of the studies we reanalyse. However, we also discover marked differences, such as Dice similarity coefficients ranging from 0.000 to 0.684 in comparisons of thresholded statistic maps between software. We discuss the challenges involved in trying to reanalyse the published studies, and highlight our efforts to make this research reproducible.
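For readers unfamiliar with the Dice similarity coefficient quoted above, the following is a minimal sketch of how it could be computed for two thresholded statistic maps. It is an illustration only, not the authors' code: the function names, the toy 2 x 3 maps, and the threshold of 3.0 are all assumptions made for this example. The coefficient is twice the number of voxels that are suprathreshold in both maps, divided by the total number of suprathreshold voxels across the two maps, so 0 means no overlap and 1 means identical thresholded maps.

```python
import math


def suprathreshold_voxels(stat_map, threshold):
    """Return the set of (row, col) indices whose value exceeds the threshold.

    `stat_map` is a hypothetical 2D statistic map given as a list of rows.
    """
    return {
        (i, j)
        for i, row in enumerate(stat_map)
        for j, value in enumerate(row)
        if value > threshold
    }


def dice_coefficient(voxels_a, voxels_b):
    """Dice similarity: 2 * |A and B| / (|A| + |B|) for two voxel sets."""
    total = len(voxels_a) + len(voxels_b)
    if total == 0:
        # Undefined when neither map has any suprathreshold voxels.
        return math.nan
    return 2.0 * len(voxels_a & voxels_b) / total


# Toy statistic maps standing in for two packages' results (made-up values).
t_map_a = [[0.5, 3.2, 4.1],
           [2.9, 3.5, 0.1]]
t_map_b = [[3.3, 3.4, 1.0],
           [0.2, 3.8, 0.4]]

threshold = 3.0  # arbitrary illustrative threshold, not taken from the paper

dice = dice_coefficient(
    suprathreshold_voxels(t_map_a, threshold),
    suprathreshold_voxels(t_map_b, threshold),
)
print(f"Dice = {dice:.3f}")
```

In this toy case each map has three suprathreshold voxels, two of which coincide, giving a Dice coefficient of 2 × 2 / (3 + 3) ≈ 0.667.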

Highlights

Functional Magnetic Resonance Imaging for human brain mapping gives researchers remarkable power to probe the underpinnings of human cognition, behavior and emotion.
Variability in T-statistic values and locations of significant activation was substantial between software packages across all three studies.
Across all three of the studies reanalysed here, we have discovered considerable differences between the AFNI, FSL, and SPM results.

Summary

Introduction

Functional Magnetic Resonance Imaging (fMRI) for human brain mapping gives researchers remarkable power to probe the underpinnings of human cognition, behavior and emotion. In related work (Glatard et al., 2015), changes in operating system led to differences in the results of an independent component analysis of resting-state fMRI data carried out using FSL. Disparities in both the number of components determined and the information between matched components were found when the analysis was conducted on two separate computing clusters. In perhaps the most comprehensive of such studies (Carp, 2012a), a single publicly available fMRI dataset was analyzed using over 6,000 unique analysis pipelines, generating 34,560 unique thresholded activation images. These results displayed a substantial degree of flexibility in both the sizes and locations of significant activation. Together, these examples paint a sombre picture for the possibility of study reproducibility.

Objectives

Methods

Results

Discussion

Conclusion