Information structure in mappings: an approach to learning, representation and generalisation
dc.contributor.advisor
Smith, Kenny
dc.contributor.advisor
Titov, Ivan
dc.contributor.author
Conklin, Henry Coxe
dc.date.accessioned
2025-06-30T12:34:59Z
dc.date.available
2025-06-30T12:34:59Z
dc.date.issued
2025-06-30
dc.description.abstract
Mappings relate two different spaces, transforming things of one kind into another; they are ubiquitous across the sciences and the world around us. Mathematical functions map between a domain and a range, digital phone systems map waveforms to binary, and ribosomes map mRNA sequences to proteins as part of a larger mapping between genotypes and phenotypes. Telegraph operators map back and forth between text and Morse code, artificial neural networks map inputs to vector representations, and language allows us to map our thoughts to sentences that express them. The structure of these mappings differs widely, shaped either by the selection pressures of their environment or by the concerns of their architects.
Despite the remarkable success of large-scale neural networks in recent years, we still lack a unified notation for thinking about and describing their representational spaces. We lack methods to reliably describe how their representations are structured, how that structure emerges over training, and what kinds of structures are desirable. This thesis introduces quantitative methods for identifying systematic structure in mappings between spaces, and leverages them to understand how deep-learning models learn to represent information, what representational structures drive generalisation, and how design decisions condition the structures that emerge. To do this, I identify basic kinds of system-level structure present in a mapping, along with information-theoretic quantifications of each of them. I use these to analyse learning, structure, and generalisation across multi-agent reinforcement learning models, sequence-to-sequence models trained on a single task, models trained with meta-learning objectives, and Large Language Models. I also introduce a novel, performant approach to estimating the entropy of a vector space, which allows this analysis to be applied to models ranging in size from 1 million to 12 billion parameters.
The experiments here shed light on how large-scale distributed models of cognition learn, while allowing us to draw parallels between those systems and their human analogues. They show how the structures of language, and the constraints that give rise to them, in many ways parallel the kinds of structures that drive the performance of contemporary neural networks.
en
dc.identifier.uri
https://hdl.handle.net/1842/43633
dc.identifier.uri
http://dx.doi.org/10.7488/era/6166
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Conklin, H., & Smith, K. (2022). Compositionality with variation reliably emerges in neural networks. The Eleventh International Conference on Learning Representations.
en
dc.relation.hasversion
Conklin, H., & Smith, K. (2024). Representations as language: An information-theoretic framework for interpretability. arXiv preprint arXiv:2406.02449.
en
dc.relation.hasversion
Conklin, H., Wang, B., Smith, K., & Titov, I. (2021). Meta-learning to compositionally generalize. arXiv preprint arXiv:2106.04252. http://arxiv.org/abs/2106.04252
en
dc.rights.license
CC BY 4.0
Attribution 4.0 International Deed
en
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
en
dc.subject
artificial intelligence
en
dc.subject
artificial neural networks
en
dc.subject
mapping structure
en
dc.subject
mappings
en
dc.subject
deep-learning models
en
dc.subject
design decisions
en
dc.subject
multi-agent reinforcement learning models
en
dc.subject
sequence-to-sequence models
en
dc.subject
meta-learning objectives
en
dc.title
Information structure in mappings: an approach to learning, representation and generalisation
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name: Conklin2025.pdf
- Size: 30.13 MB
- Format: Adobe Portable Document Format