In his paper "What is entrainment? Definition and applications in musical research" (this issue), M. Clayton offers a definition of entrainment that is based on a timing dimension (relative phase relationships). However, this definition may be too limited when applied to musical entrainment. Based on the idea that human engagement with music is embodied and that gestures may condition entrainment, I suggest that the definition of entrainment be broadened so as to include a spatiotemporal dimension.

Submitted 2012 January 6; accepted 2012 July 13.

The paper by Martin Clayton ("What is entrainment? Definition and applications in musical research", this issue) elucidates the notion of entrainment in a more general sense. He discusses the more distinctive features of entrainment in music and provides some examples of research about entrainment in music. Basically, I agree with the suggestion that entrainment has both an objective and a subjective component. The objective component is reflected in the statement that the evidence for entrainment will be (a) a stabilization of the relative phase relationship, and (b) the reassertion of this stability following a perturbation. He also states that entrainment does not necessarily result in synchronization in phase between rhythms of matching periods, and that entrainment can involve matching periods as well as hierarchical and polyrhythmic relationships, can be out of phase as often as in phase, and can fall almost anywhere on the symmetrical-asymmetrical continuum. The latter two statements fully support the idea of relative phase relationships, and they fit rather well with the idea that entrainment is a timing issue that can be measured and modeled. As for the subjective component, reference is made to the role of dynamic attention as a contributing factor to entrainment in social interaction. Basically, I share the viewpoint that entrainment is a highly interesting phenomenon for understanding how humans interact with music.
I agree that it is possible to measure important aspects of entrainment and that case studies are important in order to get an overview of the highly diverse ways in which entrainment interferes with music making. Indeed, more case studies are needed because musical entrainment occurs at different levels and in different forms. Furthermore, I am much in favor of developing an experimental approach towards understanding musical entrainment, in addition to field observations and mathematical modeling. I also believe that the study of entrainment can further profit from approaches that focus more on subjective components, such as experiences, intentions in entrainment, and, last but not least, the role of gestures and corporeal articulations in relation to entrainment. In what follows, I raise some questions about this, which illustrate the need for a broader definition of the concept of entrainment.

MUSICAL ENTRAINMENT INVOLVES ACTION

Martin Clayton seems to be aware of the need for a broader definition of entrainment when he states that metrical patterns may emerge directly from joint action, rather than necessarily coming into existence first at the neuronal level and then being expressed behaviorally. I found this an intriguing statement, but I did not fully grasp what Martin wanted to say. Does he mean that entrainment may be more than just a timing issue? Does he mean that human entrainment is rooted in action, in the spatial deployment of the body? Does he mean that entrainment may involve a gestural component that interacts with the time component? Unfortunately, that aspect is less well developed in the paper, although I believe that it is an essential aspect in the study of entrainment (Leman, 2007).