Edinburgh Research Archive

Effects of context on semantic representations and mechanisms in humans and language models

Authors

Carter, Georgia-Ann

Abstract

The rapid and incremental nature of language processing is a central challenge for human cognition. Understanding how this challenge is met has motivated a broad range of work, from elucidating which mechanisms support processing to exploring which computational models serve as good approximations of language processing. Typically, this work has focused on semantic processing of single words or sentences. However, we now know that context plays an important role in how the semantic system processes linguistic input. In this thesis, I investigate the influence of context on semantic representations and mechanisms in humans and language models. As much of the literature has focused on single-sentence contexts, I investigate context at both wider and narrower scales. The first branch of studies focuses on the wider scale by investigating the impact of discourse coherence on predictive processing in both humans and Large Language Models (LLMs). The second branch focuses on the narrower scale by exploring the influence of context on pre-trained word embeddings in a perceptual property prediction task for both nouns and adjective–noun phrases. In addition, I investigated how a neural network encodes perceptual features in conceptual combinations. In the first branch of work, I found that humans' lexical–semantic predictions are sensitive to discourse coherence, especially when semantic violations are present. From modelling, I found that LLMs are similarly sensitive to the relationship between a context and a target sentence. This is in addition to coherence effects and their interaction with predictability, which suggests that the benefit of a highly coherent context extends beyond merely lowering linguistic surprisal. In the second branch of work, I found reasonable performance when predicting the shape of a concept from word embeddings, but lower performance for the brightness of a concept. This was not affected by contextual prompting for noun representations, though I did find a limited effect of context when predicting the brightness of adjective–noun pairs. This has implications for the interpretability of representations derived from language models and for debates on embodiment in human conceptual processing. In the final study, I found that neural networks can flexibly encode the modulation of conceptual features when nouns are modified by scalar adjectives. They do so by first learning to generate predictions based on the adjective, and then acquiring knowledge of how the adjective modulates particular nouns. In sum, this thesis adds greater depth to our understanding of how context influences language in humans and machines.