Distinct behaviors imply distinct neural processes. Goal-directed and habitual control have been repeatedly shown to have starkly different behavioral signatures and neural substrates. The distinction critically hinges on the necessary underlying representations: stimulus–response (S–R) associations subserve habits, while response–outcome (R–O) associations allow flexible, goal-directed control. But another separation remains relatively underexplored: that between acquisition and expression. Are the structures that learn S–R and R–O representations also responsible for acting on them? Lesion studies in animal models have fractionated cortical and striatal subregions based on separable contributions to learning and performance (Ostlund & Balleine, 2005; Atallah et al., 2007). However, while these studies offer temporal specificity and causal manipulation, any single study has necessarily been restricted to select brain regions. Taken together, lesion studies have been conducted in many brain regions (for a review, see Balleine & O’Doherty, 2010), but they suffer from logical limitations: the finding that a lesion disables expression does not exclude a structure’s role in acquisition. Previous human functional magnetic resonance imaging (fMRI) studies have examined goal-directed and habitual control systems across the whole brain, but were not designed to examine the acquisition/expression distinction (Tricomi et al., 2009; Gläscher et al., 2010; Liljeholm et al., 2012). A new study by Liljeholm et al. (2015) attempts to fill this gap by measuring human behavior and neural activity while participants perform a novel behavioral task during fMRI. Their task addresses a key confound in the typical practice of studying habit acquisition through overtraining (extensive experience with a given S–R pairing): overtraining conflates acquisition with the performance improvements that come with practice.
Instead, the conditions in this experiment were designed to produce habits or goal-directed behaviors with minimal training. As expected based on the previous literature, following training, habits proved less sensitive to a change in outcome contingencies in the task, referred to as devaluation. This design allowed the authors to distinguish neural correlates of S–R/R–O representations ‘early’ vs. ‘late’ in their development (putatively capturing acquisition vs. expression). Using this task, the authors were able to replicate and extend many previous findings. Specifically, correlates of habit acquisition were found in the posterior caudate and cerebellum; whereas the former is to be expected, the latter is rarely reported in these types of studies. Similarly, correlates of habit expression in the ventral striatum and subgenual anterior cingulate cortex (ACC) both affirmed established knowledge (in the former case) and offered tantalizing clues to new functional relationships (in the latter case). Ventral striatal involvement accords with previous reports of its centrality to habitual control, but the role of subgenual ACC in this process remains a matter of much debate. Here, the authors suggest it as a functional homolog to rodent infralimbic cortex, a structure that has been implicated in both learning and expression of habits (Smith et al., 2012; Smith & Graybiel, 2013). Indeed, the question of cross-species homology remains a persistent source of tension in the instrumental control literature. The authors found that putamen activity decreased across the experiment, which they interpreted as suggesting a role in acquisition rather than expression. This appears to conflict with reports from perturbations of the rodent homolog, dorsolateral striatum (DLS; Furlong et al., 2014). Both results are, however, consistent with the observation that DLS representations ‘sharpen’ with training (Smith & Graybiel, 2013) – and thus fMRI measures of population activity might decrease.
An important question for future research is whether divergent results across species arise from interpretational nuances of different methodologies, or actual differences in the functional anatomy. As this design is novel in the field, further work is needed to confirm that it faithfully distinguishes habitual and goal-directed behavior. One potential concern is that the current task may generate these behaviors by biasing attention towards or away from devaluation-relevant stimuli rather than by distinctly training S–R vs. R–O associations. Moreover, unlike classical manifestations of habitual behavior in the devaluation literature, habits in the current study were not necessarily maladaptive: giving the incorrect (devalued) response required completing a three-button sequence, sequences that were begun but rarely completed when subjects performed a (devaluation-insensitive) habit. Whether these features had bearing on the desired distinction between habits and goal-directed actions can be tested in future studies, for instance by using shorter response sequences and decoupling the focus of attention from the devalued target. Nevertheless, that these conditions distinguished behaviors and brain networks largely along expected lines should be seen as promising, if not yet conclusive, evidence that the manipulation was successful. Finally, though this study begins to clarify a previously muddled distinction, it also hints at new divisions yet to be substantiated. In the computational reinforcement learning literature, the question arises whether goal-directed and habit systems operate largely in parallel, or