Self-directed learning in new and changing environments: understanding human algorithms for exploration
In order to act, plan, and achieve goals, people must learn about their environment and the outcomes of possible actions. One reason for human success in developing new theories and strategies when confronted with new problems is that people are not passive observers. Indeed, children ask informative questions and can adapt their strategies when inquiring about things they don’t know. In this thesis, I aim to understand how people self-direct their learning across multiple tasks when trying to achieve goals within their environment. Chapter 2 presents experiments designed to better understand people’s exploration and reward-maximising strategies across a sequence of tasks. Through these experiments, I examine how the environment, specifically the availability of information, and the prior knowledge of participants affect their exploratory strategies. To study participant strategies, I develop a general framework that considers both Bayesian approaches and a range of simpler heuristic strategies for approaching the problem of goal-directed exploration (Chapter 3). This framework aims to explain the variation in participant strategies in terms of different underlying cognitive mechanisms that guide exploration. One of the benefits of a general framework is the ability to capture a diverse set of behaviours within a continuous parameter space. I focus on the problem of understanding the differences between participants by leveraging this shared psychological space. Specific families of strategies emerge from the behaviour of participants, highlighting the importance of studying individual differences to better understand cognitive mechanisms (Chapter 4). In Chapter 5, I analyse the experimental data from Wu et al. (2018), which concern similar phenomena in human exploration, with a specific focus on people’s ability to generalise to guide their search.
My analysis shows that our general framework offers a more compelling explanation for participant behaviour than the model they present, while again highlighting the importance of looking at individual differences. From these model-based analyses, we find that people are able to adapt to the structure of their environment, and are guided by local uncertainty rather than global uncertainty during exploration. Finally, Chapter 6 looks at participants’ behaviour when learning across a sequence of tasks in which the underlying problem structures may change. How do people learn in a changing world? I show that the theory of inference by sampling can help explain distinct phenomena relating to the dynamics of learning across tasks. Our models are able to explain people’s ability to progress across tasks when they share structural similarities and their ability to adapt to change, but also specific contexts in which participants persistently fail to realise that the world has changed.