Abstract

One important problem in music information retrieval is automatic music transcription, the automated conversion of played music into a symbolic notation such as a MIDI file. Since the accuracy of previous audio-based transcription systems is not satisfactory, we propose a computer vision-based automatic music transcription system, named claVision, to perform piano music transcription. Instead of processing the audio, the system performs the transcription using only video of the performance, captured by a camera mounted over the piano keyboard. In this paper, we describe the architecture and the algorithms used in claVision. The claVision system achieves high accuracy ($F_1$ score over 0.95) and very low latency (about 7.0 ms) in real-time music transcription, even under different illumination conditions. This technology can also be used for other musical keyboard instruments.
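
For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of how an $F_1$ score is computed from detection counts. It uses the standard definition (harmonic mean of precision and recall). The function name and the example counts are hypothetical, and the abstract does not state whether claVision's score is computed per note, per key press, or per frame.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score, the harmonic mean of precision and recall.

    The counts are assumed to come from comparing transcribed notes
    against a ground-truth transcription (hypothetical setup).
    """
    predicted = true_positives + false_positives   # notes the system reported
    actual = true_positives + false_negatives      # notes actually played
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only for illustration: 96 correctly detected
# notes, 4 spurious detections, 4 missed notes.
example_score = f1_score(true_positives=96, false_positives=4, false_negatives=4)
```

With these illustrative counts, precision and recall are both 0.96, so the resulting score is also approximately 0.96.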
