Coordinating utterances during conversational dialogue: the role of content and timing predictions
Item status: Restricted Access
Embargo end date: 02/07/2020
During conversation, we take turns at talk and switch between listening to a speaker and producing an appropriate and timely response. In fact, we often do so with relatively little gap or overlap between our own and our partner’s contribution. Some theories argue that we manage this process by predicting what we are going to hear. For example, if a speaker says I would like to go outside to fly a…, then the listener may predict that the speaker’s next word will likely be kite. However, little is known about how these predictions aid coordination during conversational dialogue. In particular, how does prediction help listeners comprehend the speaker’s turn, prepare a response (i.e., decide what they want to say), and time its articulation (i.e., decide when they want to say it)? And to what extent are these processes interwoven? This thesis first addressed these questions by presenting participants with questions in which they either could (e.g., Are dogs your favorite animal?) or could not (e.g., Would you like to go to the supermarket?) predict the speaker’s final word. We asked them to complete either a button-pressing task (Experiments 1 and 3), in which they indicated when they thought the speaker would reach the end of their utterance, or a question-answering task (Experiments 2 and 4), in which they verbally answered each question either yes or no. We found that listeners responded earlier in the question-answering task when the final word(s) of the question were predictable rather than unpredictable. However, we found no effects of content or length predictability on the precision (i.e., how closely participants responded to the speaker’s turn-end) of participants’ button-presses or verbal responses. Thus, the results of Experiments 1-4 suggest that listeners use content predictions to prepare a response, but not to predict turn-endings. In other words, response preparation and the timing of articulation relied on different mechanisms.
Experiments 5 and 6 also used a question-answering task and provided further support for this conclusion. In particular, we manipulated the speech rate of the context (e.g., Do you have a…) and the final word (e.g., dog?) of questions using time-compression, so that each component was spoken at the natural rate or twice as fast. We found that participants responded earlier when the context was speeded rather than natural, suggesting they entrained to the speaker’s context rate, which in turn influenced when they launched articulation. We also found that listeners responded earlier when the speaker’s final word (consisting of a single syllable) was speeded rather than natural, regardless of context rate, suggesting they updated their entrainment after encountering a single syllable at a different rate. In Experiment 6, this final word effect occurred regardless of whether the speaker’s final word was predictable or unpredictable, suggesting that speech rate entrainment was used to time articulation independently from preparing the content of a response. Finally, since response preparation and the timing of articulation both rest on successfully comprehending the speaker’s turn, Experiments 7-9 investigated how prediction helps listeners understand distorted speech by presenting participants with question-answer sequences in which the answer was distorted. Comprehension of the distorted answer was sensitive to the plausibility of the answer rather than the predictability of the question, suggesting that understanding distorted speech is driven by ease of integration rather than by prediction. Together, these studies provide insight into the role that prediction plays in comprehension, response preparation, and articulation.