The traditional analytical framework adopted by neuroimaging studies in general, and lesion-behavior studies in particular, has been inferential in nature and has focused on identifying and interpreting statistically significant effects within the sample under study. While this framework is well suited to hypothesis-testing approaches, achieving the modern goal of precision medicine requires a different, predictive framework that focuses on maximizing the predictive power of models and evaluating their ability to generalize beyond the data used to train them. However, few tools exist to support the development and evaluation of predictive models in the context of neuroimaging or lesion-behavior research, creating an obstacle to the widespread adoption of predictive modeling approaches in the field. Further, existing tools for lesion-behavior analysis are often unable to accommodate categorical outcome variables and often impose restrictions on the predictor data. Researchers must therefore often use different software packages and analytical approaches depending on whether they are addressing a classification or a regression problem and on whether their predictor data correspond to binary lesion images, continuous lesion-network images, connectivity matrices, or other data modalities. To address these limitations, we have developed a MATLAB software toolkit that supports both inferential and predictive modeling frameworks, accommodates both classification and regression problems, and does not impose restrictions on the modality of the predictor data.
The toolkit provides both a graphical user interface and a scripting interface, includes implementations of multiple mass-univariate, multivariate, and machine learning models, and offers built-in and customizable routines for hyperparameter optimization, cross-validation, model stacking, and significance testing. It also automatically generates text-based descriptions of key methodological details and modeling results to improve reproducibility and minimize errors in the reporting of methods and results. Here, we provide an overview and discussion of the toolkit's features and demonstrate its functionality by applying it to the question of how expressive and receptive language impairments relate to lesion location, structural disconnection, and functional network disruption in a large sample of patients with left-hemispheric brain lesions. We find that impairments in expressive vs. receptive language are most strongly associated with left lateral prefrontal and left posterior temporal/parietal damage, respectively. We also find that impairments in expressive vs. receptive language are associated with partially overlapping patterns of fronto-temporal structural disconnection, and that the associated functional networks are also similar. Importantly, we find that lesion location and lesion-derived network measures are highly predictive of both types of impairment, with predictions from models trained on these measures explaining approximately 30-40% of the variance on average when applied to data from patients not used to train the models. We have made the toolkit publicly available, and we have included a comprehensive set of tutorial notebooks to support new users in applying the toolkit in their studies.