We present the development of a robotic percussionist named Haile that is designed to demonstrate musicianship. We define robotic musicianship in this context as a combination of musical, perceptual, and interaction skills with the capacity to produce rich acoustic responses in a physical and visual manner. Haile listens to live human players, analyzes perceptual aspects of their playing in real time, and uses the product of this analysis to play along in a collaborative and improvisatory manner. It is designed to combine the benefits of computational power, perceptual modeling, and algorithmic music with the richness, visual interactivity, and expression of acoustic playing. We believe that combining machine listening, improvisational algorithms, and mechanical operations with human creativity and expression can lead to novel musical experiences and outcomes. Haile can therefore serve as a test bed for novel forms of musical human-machine interaction, bringing perceptual aspects of computer music into the physical world both visually and acoustically. This article presents our goals for the project and the approaches we took in design, mechanics, perception, and interaction to address these goals. After an overview of related work in musical robotics, machine musicianship, and music perception, we describe Haile's design, the development of two robotic arms that can strike different locations on a drum with controllable volume levels, applications developed for low- and high-level perceptual listening and improvisation, and two interactive compositions for humans and a robotic percussionist that use Haile's capabilities. We conclude with a description of a user study that was conducted to evaluate Haile's perceptual, mechanical, and interaction functionalities. The results of the study showed significant correlation between humans' and Haile's rhythmic perception, as well as strong user satisfaction with Haile's perceptual and mechanical capabilities.
The study also indicated areas for improvement, such as the need for better timbre and loudness control as well as more advanced and responsive interaction schemes.