Relationship between disfluencies, associations, and inferences in speech comprehension
Authors
Badaya, Esperanza R.
Abstract
When producing speech spontaneously, speakers do more than produce words: They hesitate, correct themselves, and fill their pauses with um and er - a range of phenomena referred to as disfluencies. In turn, the presence of these disfluencies can affect speech comprehension: Filled pauses (i.e., um or uh) have been widely attested to affect how listeners process and interpret speech. For example, the presence of a filled pause has been shown to bias comprehenders' expectations of what will follow (e.g., discourse-new entities, Arnold et al., 2004; hard-to-describe objects, Arnold et al., 2007) and their evaluations of both the speaker and the message (e.g., uncertainty, Brennan & Williams, 1995; deception, Arciuli et al., 2010).
This thesis investigates whether the online processing of disfluent speech and the interpretation of meaning can be accounted for by similar mechanisms. Previous research has shown that filled pauses are produced in predictable patterns and that their production offers a window into the speaker’s mental state. Consequently, the biases exerted by disfluencies can be accounted for by comprehenders’ passive learning of the distribution of filled pauses, and by a form of social reasoning about the causes for the speaker to experience trouble in speech production. These two mechanisms make contrasting predictions regarding the flexibility and the costs associated with comprehending disfluent speech. We took prediction of upcoming lexical items and interpretation of deceit as the test bed for these questions. The series of investigations reported here took a novel approach by comparing language comprehension in a first and a second language when speech is produced by first- or second-language speakers.
Part I explores a proposed process for efficient speech comprehension: prediction. In Experiments 1 and 2, we replicated and extended Bosker et al.’s (2014) eye-tracking studies wherein native listeners displayed anticipatory eye movements towards low-frequency items upon encountering native, but not non-native, disfluencies. In two experiments, we explored whether the presence of a filled pause led native and non-native listeners to anticipate a low-frequency word and whether this was dependent on the speaker’s identity, i.e. if it was a native or a non-native speaker. We found clear effects in a time window of analysis reflecting word recognition. For native comprehenders, the presence of a disfluency aided the recognition of a low-frequency word, regardless of the speaker’s linguistic background. In contrast, a disfluency produced by a native speaker increased the recognition of a low-frequency word in non-native listeners, while filled pauses produced by a non-native speaker benefited the recognition of both high- and low-frequency words. These results suggest that the addition of time without propositional content is a likely, but not a sufficient, explanation for the benefits due to the presence of disfluencies. Instead, comprehension of elements accompanying disfluencies is particularly beneficial when these items contextually co-occur with disfluencies.
Part II of this thesis explores how comprehenders interpret disfluent utterances. In Experiments 3 and 4 we investigated how listeners interpret disfluent speech as deceitful as a function of the speaker’s and the listener’s linguistic background. We replicated and extended Loy et al.’s (2017) eye-tracking studies in which native listeners were more likely to interpret disfluent utterances as deceitful, which was reflected in an early bias in their eye movements. Across two eye-tracking experiments, we found that utterances containing a filled pause were more likely to be interpreted as deceptive. Importantly, the emergence of this bias occurred early in the time course of comprehension, as evidenced by eye movements. Further, listeners were insensitive to the presence of alternative causes for the speaker to be disfluent (e.g., producing speech in their second language), nor did the task demands (e.g., comprehending speech in their second language) impact the emergence of this disfluency-as-deception bias. The speed with which the effect emerged, alongside its invariance, suggests that the bias has its roots in comprehenders’ stereotypes about the sound of deceit, i.e., an association between disfluency and deception.
Overall, the findings of this experimental work support an account where the effects of filled pauses are better conceptualised as a routine. Experience with language creates a ‘heuristic’ whereby filled pauses are contextually associated with language production difficulties, which constrains processing in a relatively cost-free manner. The emergence of this heuristic may fall under a general capacity of comprehenders to monitor their interlocutor (e.g., epistemic vigilance) which evaluates both their competence and their honesty. Further studies should explore the combination of other verbal and non-verbal cues associated with speaker confidence to investigate whether comprehension of filled pauses is indeed a reflection of the routinisation of social cognition in language comprehension.