Abstract
When listening to a piece of music, listeners often identify distinct sections or segments
within the piece. Music segmentation is recognised as an important process in the abstraction
of musical content, and researchers have attempted to explain how listeners
perceive and identify the boundaries of these segments.
The present study seeks to develop a system capable of performing
melodic segmentation in an unsupervised way, by learning from non-annotated musical
data. Probabilistic learning methods have been widely used to acquire regularities in
large sets of data, with many successful applications in language and speech processing.
Some of these applications have found their counterparts in music research and have
been used for music prediction and generation, music retrieval, and music analysis, but
seldom to model perceptual and cognitive aspects of music listening.
We present some preliminary experiments on melodic segmentation, which highlight
the importance of memory and the role of learning in music listening. These experiments
have motivated the development of a computational model for melodic segmentation
based on a probabilistic learning paradigm.
The model uses a mixed-memory Markov model to estimate sequence probabilities
from pitch- and time-based parametric descriptions of melodic data. We follow the assumption
that listeners' perception of feature salience in melodies is strongly related
to expectation. Moreover, we conjecture that pronounced variations in the entropy of certain
melodic features coincide with segmentation boundaries as indicated by listeners.
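As a minimal sketch of the quantities involved (assuming the standard mixed-memory factorisation of Saul and Jordan; the symbols $\psi$, $a_i$ and $H_t$ are illustrative rather than taken from the model itself), higher-order dependencies are approximated by a convex combination of pairwise transition models,

$$P(x_t \mid x_{t-1}, \dots, x_{t-k}) \approx \sum_{i=1}^{k} \psi(i)\, a_i(x_t \mid x_{t-i}), \qquad \psi(i) \geq 0, \quad \sum_{i=1}^{k} \psi(i) = 1,$$

and candidate boundaries are placed where the entropy of this predictive distribution,

$$H_t = -\sum_{x} P(x \mid x_{t-1}, \dots, x_{t-k}) \log_2 P(x \mid x_{t-1}, \dots, x_{t-k}),$$

changes markedly from event to event.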
The model's segmentation predictions are compared with the results of a listening study on
melodic segmentation carried out with human listeners. Overall, the results show that changes
in prediction entropy along the pieces correspond significantly with the listeners'
segmentation boundaries.
Although the model relies only on information-theoretic principles to predict
the location of segmentation boundaries, most of the predicted segments
can be matched with boundaries of groupings usually attributed to Gestalt rules.
These results question previous research supporting a separation between learning-based
and innate bottom-up processes of melodic grouping, and suggest that some
of the latter processes can emerge from regularities acquired from melodic data.