Abstract

One important problem in musical information retrieval is automatic music transcription, which is an automated conversion process from played music to a symbolic notation such as MIDI file. Since the accuracy of previous audio-based transcription systems is not satisfactory, we propose an innovative computer vision-based automatic music transcription system named claVision to perform piano music transcription. Instead of processing the music audio, the system performs the transcription only from the video performance captured by a camera mounted over the piano keyboard. In this paper, we describe the architecture and the algorithms used in claVision. The claVision system has a high accuracy ( ${F_1}$ score over 0.95) and a very low latency (about 7.0 ms) in real-time music transcription, even under different illumination conditions. This technology can also be used for other musical keyboard instruments.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call