Abstract

We describe a divide and conquer strategy for an exploratory data analysis (EDA) of large functional magnetic resonance imaging (fMRI) data sets. The need for an EDA to precede and complement a confirmatory model‐based analysis is now well established. For complex fMRI experiments, where a prior model of the expected response cannot be posited, the sole option is to conduct an initial EDA. An EDA often discovers unanticipated behavior, allowing the experimenter to augment or even change the original hypothesis. In addition, the gross artifact behavior that EDA makes evident may aid the experimenter in deciding whether the data set is even usable, some additional preprocessing step is required, or the one used has introduced spurious effects. The proposed strategy, named EROICA for exploring regions of interest with cluster analysis, evolved from an empirical observation that a typical cluster of activation or artifact time series can be partitioned into three subsets: time series corrupted by significant trends and time series above and below some noise level. Moreover, the sought‐after common temporal behavior among the cluster time series can be extracted in an uncorrupted form from the above noise level time series alone. Thus, the key feature of EROICA is the initial partition of the data set into trendy and below the noise level time series, followed by the fuzzy cluster analysis (FCA) of the above the noise level time series to extract common cluster behavior patterns (centroids). The initial partition is based on a test statistic in the power spectrum domain. 
This step has significant ramifications: it greatly speeds up the FCA because of the much smaller number of time series to cluster; it makes the clustering results more robust because they are no longer affected by the trendy and noisy time series; the above the noise level time series can be further grouped according to the location of the spectral peak on the frequency axis, and these groups can be used to create a subset of initial centroids that greatly improves the convergence rate of the FCA; and the group of below the noise level time series (referred to as the noise pool) can be used as a data‐driven representation of the underlying noise source. In the final step, each time series is modeled as a linear combination of the closest centroid plus noise. The noise pool is very convenient for obtaining thresholds when testing the significance of the model parameter without having to model or assume the distributional properties of the underlying noise source. To limit the number of false positives in the activation maps, the significance test also tests the time series power spectrum values at the frequency locations determined by the cluster centroid. EROICA is one of the analysis options offered by the general image‐processing package EvIdent®. © 2003 Wiley Periodicals, Inc. Concepts Magn Reson 16A: 50–62, 2003
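
The partition and noise-pool steps described above can be illustrated with a minimal sketch. The abstract does not specify the test statistic, so everything here is an assumption made for illustration: the statistic (peak spectral power divided by median spectral power), the cutoff `peak_ratio`, the rule that a peak in the lowest `n_trend_bins` frequency bins marks a trend, and the function names `partition_by_spectrum`, `fit_coefficient`, and `noise_pool_threshold`. It is not the EROICA or EvIdent implementation, and the fuzzy clustering step itself is omitted.

```python
import numpy as np


def partition_by_spectrum(ts, peak_ratio=20.0, n_trend_bins=2):
    """Split the rows of `ts` into trend, above-noise, and noise-pool sets.

    Assumed statistic (not from the paper): peak power over median power.
    Returns three boolean masks and the peak frequency bin of each row.
    """
    ts = np.asarray(ts, dtype=float)
    centered = ts - ts.mean(axis=1, keepdims=True)
    # One-sided power spectrum with the zero-frequency bin dropped.
    power = np.abs(np.fft.rfft(centered, axis=1))[:, 1:] ** 2
    peak_bin = power.argmax(axis=1) + 1  # 1-based frequency bin of the peak
    stat = power.max(axis=1) / (np.median(power, axis=1) + 1e-12)
    significant = stat > peak_ratio
    # Assumption: a significant peak at the lowest frequencies is a trend.
    trend = significant & (peak_bin <= n_trend_bins)
    above = significant & ~trend
    noise_pool = ~significant
    return trend, above, noise_pool, peak_bin


def fit_coefficient(series, centroid):
    """Least-squares coefficient of a mean-centered centroid in each series."""
    series = np.asarray(series, dtype=float)
    c = np.asarray(centroid, dtype=float)
    c = c - c.mean()
    x = series - series.mean(axis=-1, keepdims=True)
    return x @ c / (c @ c)


def noise_pool_threshold(noise_ts, centroid, alpha=0.05):
    """Empirical (1 - alpha) quantile of |coefficient| over the noise pool.

    No distributional form is assumed for the noise; the pool itself
    supplies the null distribution.
    """
    coeffs = np.abs(fit_coefficient(noise_ts, centroid))
    return np.quantile(coeffs, 1.0 - alpha)
```

In this sketch, `peak_bin` could be used to group the above-noise series by spectral peak location before clustering, and a series would be declared significant for a given centroid when its fitted coefficient exceeds the threshold derived from the noise pool.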

