Abstract

Music generation has generally been focused on either creating scores or interpreting them. We discuss differences between these two problems and propose that, in fact, it may be valuable to work in the space of direct performance generation: jointly predicting the notes and also their expressive timing and dynamics. We consider the significance and qualities of the dataset needed for this. Having identified both a problem domain and characteristics of an appropriate dataset, we show an LSTM-based recurrent network model that subjectively performs quite well on this task. Critically, we provide generated examples. We also include feedback from professional composers and musicians about some of these examples.

Highlights

  • Recognizing that "talking about music is like dancing about architecture",1 we kindly ask the reader to listen to the linked audio in order to effectively understand the motivation, data, results, and conclusions of this paper

  • We discuss differences between these two problems and propose that it may be valuable to work in the space of direct performance generation: jointly predicting the notes and their expressive timing and dynamics

  • We consider the significance and qualities of the dataset needed for this. Having identified both a problem domain and characteristics of an appropriate dataset, we show an LSTM-based recurrent network model that subjectively performs quite well on this task

Summary

Introduction

Recognizing that "talking about music is like dancing about architecture",1 we kindly ask the reader to listen to the linked audio in order to effectively understand the motivation, data, results, and conclusions of this paper. As this research is about producing music, we believe the actual results are most effectively perceived (indeed, only perceived) in the audio domain. This will provide necessary context for the verbal descriptions in the rest of the paper. The first two keywords in the title are time and feeling: not coincidentally, our central thesis is that, given the current state of the art in music generation systems, it is effective to generate the expressive timing and dynamics information concurrently with the music. We do this by directly generating improvised performances rather than creating or interpreting scores. We begin with an exposition of some relevant musical concepts.
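To make the idea of joint prediction concrete, the sketch below (ours, not the authors' released code) models a performance as a single stream of discrete events, so that which note to play, when to play it, and how loudly are all sampled from one LSTM. The event-vocabulary breakdown (note-on, note-off, time-shift, and velocity bins) follows the MIDI-like representation described in the paper; the class name PerformanceLSTM, the layer sizes, and the sampling loop are illustrative assumptions, written here in PyTorch.

import torch
import torch.nn as nn

# Assumed MIDI-like vocabulary: 128 note-ons + 128 note-offs
# + 100 time-shift bins + 32 velocity bins = 388 events.
VOCAB_SIZE = 128 + 128 + 100 + 32

class PerformanceLSTM(nn.Module):
    def __init__(self, vocab_size=VOCAB_SIZE, embed_dim=256, hidden_dim=512, num_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, events, state=None):
        x = self.embed(events)                # (batch, time, embed_dim)
        out, state = self.lstm(x, state)      # (batch, time, hidden_dim)
        return self.proj(out), state          # logits over the next event

@torch.no_grad()
def sample(model, prime, steps=100, temperature=1.0):
    """Autoregressively sample a performance, one event at a time."""
    model.eval()
    events = prime.clone()                    # (1, t) priming events
    logits, state = model(events)
    for _ in range(steps):
        probs = torch.softmax(logits[:, -1] / temperature, dim=-1)
        nxt = torch.multinomial(probs, 1)     # draw one event index
        events = torch.cat([events, nxt], dim=1)
        logits, state = model(nxt, state)
    return events

model = PerformanceLSTM()
prime = torch.randint(0, VOCAB_SIZE, (1, 8))  # stand-in for real performance events
print(sample(model, prime, steps=32).shape)   # torch.Size([1, 40])

Because a single softmax ranges over note, timing, and dynamics events alike, the model cannot choose a note without also committing to how long to wait and how hard to strike; this is one simple way to realize the joint prediction the paper argues for.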
