Current neuroimaging studies frequently use complex machine learning models to classify human fMRI data, distinguishing healthy from disordered brains, often to validate new methods or improve prediction accuracy. Yet where prediction accuracy is the goal, our results suggest that it does not always require such sophistication. When a classifier as simple as logistic regression is applied to feature-engineered fMRI data, it can match or even outperform more sophisticated recent models. Classification of raw fMRI time series generally benefits from complex, parameter-rich models, but this complexity often turns them into black boxes. Nevertheless, we found that a relatively simple model can consistently outperform much more complex classifiers in both accuracy and speed. This model applies the same multi-layer perceptron to every time point and averages the results. Thus, the complexity and black-box nature of parameter-rich models, often perceived as a necessary trade-off for higher performance, do not invariably yield superior results on fMRI. Given the success of straightforward approaches, we question the merit of research that concentrates solely on developing complex models for classification. Unless a complex model significantly outperforms logistic regression or our proposed model, we instead advocate greater focus on designing models that prioritize the explainability of fMRI data or pursue applicable objectives beyond mere classification accuracy. To support our claim, we explore possible reasons for the superior performance of our straightforward model by examining the intrinsic characteristics of fMRI time series data. Our findings suggest that the sequential information carried by temporal order may be far less important for accurate fMRI classification than the stand-alone pieces of information scattered across the frames of the time series.
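The idea of applying one multi-layer perceptron to every time point and averaging the results can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`init_mlp`, `mean_mlp_logits`), the single hidden layer, the ReLU activation, the layer sizes, and the choice to average logits rather than probabilities are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np


def init_mlp(n_in, n_hidden, n_out, rng):
    """Random weights for a one-hidden-layer MLP (hypothetical sizes)."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def mean_mlp_logits(params, x):
    """Apply the same MLP to each frame of x and average over time.

    x has shape (time, regions); the result has shape (n_out,).
    """
    hidden = np.maximum(x @ params["W1"] + params["b1"], 0.0)  # ReLU, per frame
    frame_logits = hidden @ params["W2"] + params["b2"]        # (time, n_out)
    return frame_logits.mean(axis=0)                           # average over frames


rng = np.random.default_rng(0)
params = init_mlp(n_in=100, n_hidden=32, n_out=2, rng=rng)

# Synthetic stand-in for one scan: 120 frames, 100 brain regions.
scan = rng.normal(size=(120, 100))
logits = mean_mlp_logits(params, scan)

# Shuffling the frames does not change the output, because the average
# over time ignores temporal order.
shuffled = scan[rng.permutation(scan.shape[0])]
order_invariant = np.allclose(logits, mean_mlp_logits(params, shuffled))
```

Because each frame is processed independently and then averaged, a model of this form cannot use temporal order at all. Its competitiveness is therefore consistent with the observation that order carries less useful information for classification than the individual frames do.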