Language of music: a computational model of music interpretation
dc.contributor.advisor
Steedman, Mark
en
dc.contributor.advisor
King, Simon
en
dc.contributor.author
McLeod, Andrew Philip
en
dc.date.accessioned
2018-07-19T12:04:52Z
dc.date.available
2018-07-19T12:04:52Z
dc.date.issued
2018-07-02
dc.description.abstract
Automatic music transcription (AMT) is commonly defined as the process of converting
an acoustic musical signal into some form of musical notation, and can be split
into two separate phases: (1) multi-pitch detection, the conversion of an audio signal
into a time-frequency representation similar to a MIDI file; and (2) converting from
this time-frequency representation into a musical score. A substantial amount of AMT
research in recent years has concentrated on multi-pitch detection, and yet, in the case
of the transcription of polyphonic music, there has been little progress.
There are many potential reasons for this slow progress, but this thesis concentrates
on the (lack of) use of music language models during the transcription process. In particular,
a music language model would impart to a transcription system the background
knowledge of music theory upon which a human transcriber relies. In the related field
of automatic speech recognition, it has been shown that the use of a language model
drawn from the field of natural language processing (NLP) is an essential component
of a system for transcribing spoken word into text, and there is no reason to believe
that music should be any different.
This thesis will show that a music language model inspired by NLP techniques can
be used successfully for transcription. In fact, this thesis will create the blueprint for
such a music language model. We begin with a brief overview of existing multi-pitch
detection systems, in particular noting four key properties which any music language
model should have to be useful for integration into a joint system for AMT: it should
(1) be probabilistic, (2) not use any data a priori, (3) be able to run on live performance
data, and (4) be incremental.
We then investigate voice separation, creating a model which achieves state-of-the-art
performance on the task, and show that, used as a simple music language model, it
improves multi-pitch detection performance significantly. This is followed by an investigation
of metrical detection and alignment, where we introduce a grammar crafted for
the task which, combined with a beat-tracking model, achieves state-of-the-art results
on metrical alignment. This system’s success adds more evidence to the long-standing
hypothesis that music and language consist of extremely similar structures.
We end by investigating the joint analysis of music, in particular showing that a
combination of our two models running jointly outperforms each running independently.
We also introduce a new joint, automatic, quantitative metric for the complete
transcription of an audio recording into an annotated musical score, something which
the field currently lacks.
en
dc.identifier.uri
http://hdl.handle.net/1842/31371
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
McLeod, A., Schramm, R., Steedman, M., & Benetos, E. (2017). Automatic Transcription of Polyphonic Vocal Music. Applied Sciences, 7(12).
en
dc.relation.hasversion
McLeod, A., & Steedman, M. (2016, January). HMM-based voice separation of MIDI performance. Journal of New Music Research, 45(1), 17–26.
en
dc.relation.hasversion
McLeod, A., & Steedman, M. (2017). Meter detection in symbolic music using a lexicalized PCFG. In SMC (pp. 373–379).
en
dc.relation.hasversion
McLeod, A., & Steedman, M. (2018). Evaluating automatic polyphonic music transcription. In ISMIR.
en
dc.relation.hasversion
McLeod, A., & Steedman, M. (2018). Meter detection and alignment of MIDI performance. In ISMIR.
en
dc.relation.hasversion
Schramm, R., McLeod, A., Steedman, M., & Benetos, E. (2017). Multi-pitch detection and voice assignment for a cappella recordings of multiple singers. In ISMIR (pp. 552–559).
en
dc.subject
music information retrieval
en
dc.subject
automatic music transcription
en
dc.subject
music language modelling
en
dc.title
Language of music: a computational model of music interpretation
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name: McLeod2018.pdf
- Size: 1.41 MB
- Format: Adobe Portable Document Format