Vast amounts of publicly licensed classical music resources are housed within many different repositories on the Web, encompassing richly diverse facets of information, including bibliographical and biographical data, digitized images of music notation, music score encodings, audiovisual performance recordings, derived feature data, scholarly commentaries, and listener reactions. While these varied perspectives ought to contribute to a more holistic understanding of the music objects under consideration, in practice such repositories are typically minimally connected. The TROMPA project aims to improve this situation by interconnecting and enriching public-domain music repositories. This is achieved, on the one hand, by the application of automated, cutting-edge Music Information Retrieval techniques, and on the other, by the development of contribution mechanisms enabling users to integrate their expertise. Information within established repositories is interrelated with data generated by the project within a data infrastructure whose design is guided by the FAIR principles of data management and stewardship: making music information Findable, Accessible, Interoperable, and Reusable. We provide an overview of the challenges of description, identification, representation, contribution, and reliability that arise in applying the FAIR principles to music information, and outline TROMPA's implementation approach to overcoming these challenges. This approach applies a graph-based data infrastructure to interrelate information hosted in different repositories on the Web within a unifying data model (a 'knowledge graph'). Connections are generated across different representations of music content beyond the catalogue level, for instance connecting note elements within score encodings to corresponding moments in performance timelines. Contributions of user data are supported via privacy-first mechanisms that retain control of such data with the contributing user. Provenance information is captured throughout, supporting reproducibility and re-use of the data both within and outside the context of the project.