Abstract
This paper presents an object lesson in the challenges and considerations involved in assembling a musical corpus for empirical research. It develops a model for the construction of a representative corpus of classical music of the “common practice period” (1700-1900), using specific composers as well as broader historical styles and musical genres (e.g., symphony, chamber music, songs, operas) as its sampling parameters. Five sources were used in the construction of the model: (a) The Oxford History of Western Music by Richard Taruskin (2005), (b) amalgamated Orchestral Repertoire Reports for the years 2000-2007, from the League of American Orchestras, (c) a list of titles from the Naxos.com “Music in the Movies” web-based library, (d) Barlow and Morgenstern’s Dictionary of Musical Themes (1948), and (e) for the composers listed in sources (a)-(d), counts of the number of recordings of each composer available from Amazon.com. General considerations for these sources are discussed first, and specific aspects of each source are then detailed. Intersource agreement is assessed, showing strong consensus among all sources except the Taruskin History. Using the Amazon.com data to determine weighting factors for each parameter, a preliminary sampling model is proposed. Including adequate genre representation leads to a corpus of ≈300 pieces, which suggests the minimum size for an adequately representative corpus of classical music. The approaches detailed here may be applied to more specialized contexts, such as the music of a particular geographic region, historical era, or genre.