Quantifying the perceptual value of lexical and non-lexical channels of spoken dialogue
dc.contributor.advisor
Lai, Catherine
dc.contributor.advisor
Bell, Peter
dc.contributor.author
Wallbridge, Sarenne
dc.date.accessioned
2025-03-12T13:17:32Z
dc.date.available
2025-03-12T13:17:32Z
dc.date.issued
2025-03-12
dc.description.abstract
Spoken conversation is one of the most widely used means of communication. Still, the cognitive mechanisms we employ to decode information from the speech
signal are not well understood. Predictive theories of language processing, which explain language comprehension as a function of the alignment between our predictions
about the upcoming signal and its actual realization, have been widely adopted across
the fields of psycholinguistics, cognitive science, and linguistics. However, empirical
support for such theories stems primarily from written language comprehension.
This thesis is a study of how predictive processing may be involved in the comprehension of spoken dialogue. We focus on two features that distinguish spoken
interaction from written sentences and how these features may alter comprehension
behaviours. First, speech is a much richer signal than text; information can be communicated through both the lexical channel of which words are said, and the non-lexical
channel of how those words are spoken. We examine how these channels contribute,
both independently and jointly, to predictions about upcoming turns in dialogue. Second, making predictions in spoken dialogue involves reasoning about abstract aspects
of language such as pragmatics that may not be evident in written sentences. As
such, we shift focus away from the content of predictions to instead investigate their
genesis—specifically, how do the lexical and non-lexical channels constrain the shape
of expectations regarding the upcoming signal? To answer this question, we propose
Perceptual Information Value, which quantifies the value of a channel in terms of how it
alters expectations, as well as a behavioural paradigm to measure it in spoken dialogue.
We begin by investigating predictions of the lexical content of dialogue. Our results show that humans generate expectations at the level of dialogue turns and that
their predictions contain inherent variability. We argue that this variability is an important component of how people process more realistic forms of language use, such as
spoken dialogue. Leveraging large language models as purely predictive processing
mechanisms, we demonstrate a degree of alignment between human and model predictions; however, this alignment is highly sensitive to the model architecture and training objective. In particular, model predictions do not contain the variability that we observe in
human predictions. Next, we use this foundation to study how predictions manifest in
spoken dialogue where messages can be distributed across both lexical and non-lexical
channels. Our experiments show that expectations about spoken dialogue turns are
a function of both lexical and non-lexical channels, as well as their joint expression.
Importantly, we find that access to channels can constrain expectations meaningfully,
even if it yields less accurate predictions. We therefore argue that the perceptual value
of information in a channel lies not in its effect on predictive accuracy but more generally in its capacity to shape expectations.
en
dc.identifier.uri
https://hdl.handle.net/1842/43197
dc.identifier.uri
http://dx.doi.org/10.7488/era/5738
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Wallbridge, S., Bell, P., and Lai, C. (2022) Investigating perception of spoken dialogue acceptability through surprisal. In Proceedings of Interspeech (4506–4510)
en
dc.relation.hasversion
Wallbridge, S., Bell, P., and Lai, C. (2023) Do dialogue representations align with perception? An empirical study. Long paper in Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (2696–2713)
en
dc.relation.hasversion
Wallbridge, S., Bell, P., and Lai, C. (2021) It’s not what you said, it’s how you said it: discriminative perception of speech as a multi-channel communication system. In Proceedings of Interspeech (2386–2390)
en
dc.relation.hasversion
Wallbridge, S., Bell, P., and Lai, C. (2023) Quantifying the perceptual value of lexical and non-lexical channels in speech. In Proceedings of Interspeech (2708–2712)
en
dc.relation.hasversion
Adigwe, A., Wallbridge, S., and King, S. (2024). What makes conversational speech: Investigating acoustic-prosodic cues in speech perception. In Proceedings of Interspeech
en
dc.relation.hasversion
Giulianelli, M., Wallbridge, S., and Fernandez, R. (2023). Information value: Measuring utterance predictability as distance from plausible alternatives. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 5633–5653.
en
dc.subject
lexical channel
en
dc.subject
non-lexical channel
en
dc.subject
dialogue prediction
en
dc.subject
predictive processing
en
dc.subject
Perceptual Information Value
en
dc.subject
large language models
en
dc.title
Quantifying the perceptual value of lexical and non-lexical channels of spoken dialogue
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name:
- Wallbridge2025.pdf
- Size:
- 4.67 MB
- Format:
- Adobe Portable Document Format

