The ability to follow the gaze of conspecifics is a critical component in the development of social behaviors, and considerable effort has been devoted to determining the earliest age at which it begins to develop in infants. Developmental and neurophysiological studies suggest that imitative learning takes place once gaze-following abilities are fully established and joint attention can support the shared behavior required by imitation. Accordingly, gaze-following acquisition should be a precursor to most machine learning tasks, and imitation learning can be seen as the earliest modality for acquiring meaningful gaze shifts and for understanding the structural substrate of fixations. Indeed, if some early attentional process, based on a suitable combination of gaze shifts and fixations, could be learned by the robot, then several demonstration learning tasks would be dramatically simplified. In this paper, we describe a methodology for learning gaze shifts based on imitation of gaze following with a gaze machine, which we purposefully introduced to make the robot's gaze imitation conspicuous. The machine allows the robot to share and imitate the gaze shifts and fixations of a caregiver through mutual vergence. This process is then generalized by learning both the salient scene features toward which the gaze is directed and the way saccadic programming is attained. Salient features are modeled by a family of Gaussian mixtures. These, together with the learned transitions, are generalized via hidden Markov models to account for humanlike gaze shifts, making it possible to discriminate salient locations.