In the present study, we investigate the supervised problem of composer classification. Given a set of compositions and a set of composers, we seek to assign each composition to the correct composer using machine learning and natural language processing techniques. We use n-grams to create vector representations of musical compositions and classify them with a Support Vector Machine (SVM) classifier applied to a term-frequency matrix built from the composition vectors. Our representation takes into account melodic relationships between instruments in polyphonic pieces: we extract n-grams along the melodic direction, allowing an n-gram to pass from one instrument to another, which aims to generate more robust n-grams and a greater number of n-gram occurrences. We evaluate different classification models using feature filtering and varying hyperparameters such as the TF-IDF formula, among others. We test our method on a dataset of string quartets by Haydn and Mozart, achieving results that improve upon the previous state of the art.
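To make the idea of a term-frequency matrix over melodic n-grams concrete, the following is a minimal sketch and not the method of the study itself. It assumes each piece is a single monophonic line given as MIDI pitch numbers and that an n-gram is a tuple of consecutive semitone intervals; it does not model the cross-instrument traversal, the TF-IDF weighting, or the SVM step. The names `interval_ngrams` and `term_frequency_matrix` are hypothetical.

```python
from collections import Counter


def interval_ngrams(pitches, n):
    """Return the n-grams of melodic intervals (in semitones) of one pitch sequence."""
    # Assumption: pitches are MIDI note numbers of a single melodic line.
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return [tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)]


def term_frequency_matrix(pieces, n):
    """Build a vocabulary of n-grams and one raw count vector per piece."""
    counts = [Counter(interval_ngrams(piece, n)) for piece in pieces]
    vocab = sorted(set().union(*counts))  # every distinct n-gram, in a fixed order
    matrix = [[c[gram] for gram in vocab] for c in counts]  # absent n-grams count as 0
    return vocab, matrix


# Two toy melodies (illustrative data, not taken from the dataset).
pieces = [
    [60, 62, 64, 65, 67],
    [60, 62, 64, 62, 60],
]
vocab, matrix = term_frequency_matrix(pieces, n=2)
for gram, column in zip(vocab, zip(*matrix)):
    print(gram, column)
```

Each row of `matrix` is the vector of one piece. In a full pipeline, such rows would be reweighted (for example with TF-IDF) and passed to a classifier.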