dc.contributor.advisor | Webb, Barbara | |
dc.contributor.advisor | Ramamoorthy, Subramanian | |
dc.contributor.author | Gkanias, Evripidis | |
dc.date.accessioned | 2023-03-28T15:36:45Z | |
dc.date.available | 2023-03-28T15:36:45Z | |
dc.date.issued | 2023-03-28 | |
dc.identifier.uri | https://hdl.handle.net/1842/40449 | |
dc.identifier.uri | http://dx.doi.org/10.7488/era/3217 | |
dc.description.abstract | Historically, reinforcement learning is a branch of machine learning founded on observations of how animals learn. This involved a collaboration between the fields of biology and artificial intelligence that was beneficial to both, creating smarter artificial agents and improving the understanding of how biological systems function. The evolution of reinforcement learning over the past few years has been rapid but has substantially diverged from providing insights into how biological systems work, opening a gap between reinforcement learning and biology. In an attempt to close this gap, this thesis studied the insect neuroethology of reinforcement learning, that is, the neural circuits that underlie reinforcement-learning-related behaviours in insects. The goal was to extract a biologically plausible plasticity function from insect-neuronal data, use it to explain biological findings, and compare it to more standard reinforcement learning models. Consequently, a novel dopaminergic plasticity rule was developed to approximate the function of dopamine as the plasticity mechanism between neurons in the insect brain. This allowed a range of observed learning phenomena to happen in parallel, such as memory depression, potentiation, recovery, and saturation. In addition, by using anatomical data on the connections between neurons in the mushroom body neuropils of the insect brain, the neural incentive circuit of dopaminergic and output neurons was also explored. This, together with the dopaminergic plasticity rule, allowed for dynamic collaboration amongst parallel memory functions, such as acquisition, transfer, and forgetting. When tested on olfactory conditioning paradigms, the model reproduced the observed changes in the activity of the identified neurons in fruit flies. It also replicated the observed behaviour of the animals and allowed for flexible behavioural control. Inspired by the visual navigation system of desert ants, the model was further challenged with the task of visual place recognition. Although a relatively simple encoding of the olfactory information was sufficient to explain odour learning, a more sophisticated encoding of the visual input was required to increase the separability among visual inputs and enable visual place recognition. Signal whitening and sparse combinatorial encoding were sufficient to boost the performance of the system in this task. The incentive circuit enabled the encoding of increasing familiarity along a known route, which dropped in proportion to the animal's distance from that route. Finally, the proposed model was challenged with delayed reinforcement tasks, suggesting that it might take on the role of an adaptive critic in the context of reinforcement learning. | en |
dc.contributor.sponsor | Engineering and Physical Sciences Research Council (EPSRC) | en |
dc.language.iso | en | en |
dc.publisher | The University of Edinburgh | en |
dc.relation.hasversion | E. Gkanias, B. Risse, M. Mangan, and B. Webb (2019). “From skylight input to behavioural output: a computational model of the insect polarised light compass”. PLoS Computational Biology 15(7), e1007123. | en |
dc.relation.hasversion | E. Gkanias, L. Y. McCurdy, M. N. Nitabach, and B. Webb (2022). “An incentive circuit for memory dynamics in the mushroom body of Drosophila melanogaster”. eLife 11, e75611. | en |
dc.relation.hasversion | S. Schwarz, L. Clement, E. Gkanias, and A. Wystrach (2020). “How do backward-walking ants (Cataglyphis velox) cope with navigational uncertainty?”. Animal Behaviour 164, pp. 133–142. issn: 0003-3472. doi: 10.1016/j.anbehav.2020.04.006 | en |
dc.relation.hasversion | T. Stouraitis, E. Gkanias, J. M. Hemmi, and B. Webb (2017). “Predator evasion by a Robocrab”. In: 6th International Conference on Biomimetic and Biohybrid Systems. Vol. 10384. Stanford, CA: Springer, pp. 428–439. isbn: 978-3-319-63536-1. doi: 10.1007/978-3-319-63537-8_36 | en |
dc.subject | bioinspired | en |
dc.subject | computational neuroscience | en |
dc.subject | Drosophila melanogaster | en |
dc.subject | behaviour | en |
dc.subject | motivation | en |
dc.subject | memory | en |
dc.subject | navigation | en |
dc.subject | plasticity | en |
dc.subject | olfaction | en |
dc.subject | vision | en |
dc.title | Insect neuroethology of reinforcement learning | en |
dc.type | Thesis or Dissertation | en |
dc.type.qualificationlevel | Doctoral | en |
dc.type.qualificationname | PhD Doctor of Philosophy | en |