Complex sequential behaviors, such as speaking or playing music, entail flexible rule-based chaining of single acts. However, it remains unclear how the brain translates abstract structural rules into movements. We combined music production with multimodal neuroimaging to dissociate high-level structural and low-level motor planning. Pianists played novel musical chord sequences on a muted MR-compatible piano by imitating a model hand on screen. Chord sequences were manipulated in terms of musical harmony and context length to assess structural planning, and in terms of fingers used for playing to assess motor planning. A model of probabilistic sequence processing confirmed temporally extended dependencies between chords, as opposed to local dependencies between movements. Violations of structural plans activated the left inferior frontal and middle temporal gyrus, and the fractional anisotropy of the ventral pathway connecting these two regions positively predicted behavioral measures of structural planning. A bilateral frontoparietal network was instead activated by violations of motor plans. Both structural and motor networks converged in lateral prefrontal cortex, with anterior regions contributing to musical structure building, and posterior areas to movement planning. These results establish a promising approach to study sequence production at different levels of action representation.