Hidden Markov Model for Automatic Transcription of MIDI Signals
This paper describes a Hidden Markov Model (HMM)-based method of automatic transcription of MIDI (Musical Instrument Digital Interface) signals of performed music. The problem is formulated as recognition of a given sequence of fluctuating note durations to find the most likely intended note sequence utilizing the modern continuous speech recognition technique. Combining a stochastic model of deviating note durations and a stochastic grammar representing possible sequences of notes, the maximum likelihood estimate of the note sequence is searched in terms of Viterbi algorithm. The same principle is successfully applied to a joint problem of bar line allocation, time measure recognition, and tempo estimation. Finally, durations of consecutive n notes are combined to form a "rhythm vector" representing tempo-free relative durations of the notes and treated in the same framework. Significant improvements compared with conventional "quantization" techniques are shown.