Edinburgh Research Archive

Geometry for deep representation learning

dc.contributor.advisor
Storkey, Amos
dc.contributor.advisor
Williams, Christopher
dc.contributor.author
Khan, Mohammad Asif
dc.date.accessioned
2023-11-30T12:18:53Z
dc.date.available
2023-11-30T12:18:53Z
dc.date.issued
2023-11-30
dc.description.abstract
Deep representation learning has achieved remarkable success in recent years in discovering meaningful low-dimensional features from high-dimensional data. In datasets containing face images, these features can capture underlying factors of variation, such as age, eye colour, and hairstyle. We can employ learned representations for solving tasks such as face detection. By capturing these factors of variation, the representations aim to build a model of the real world that reflects its inherent regularities. However, current approaches still struggle to discover complex regularities of the world in a data-efficient way, resulting in a lack of interpretability and robustness, and in limited generalisation. It is crucial to recognise that real-world data spaces often exhibit regularities, characterised by various symmetries, that call for appropriate modelling assumptions. Consider an image of an apple; we know that applying a translation operator to it will not change its identity as an apple. Such properties that do not change under a broad family of transformations are known as invariants. As Felix Klein put it, “Geometry is a study of invariants” (Klein, 1872). In this thesis, we utilise geometry as a fundamental principle to account for relevant properties when learning a representation space. Specifically, we propose novel methodologies to address three main challenges in deep representation learning: learning disentangled latent factors for image sequences, investigating the robustness of deep latent factor models to adversarial perturbations, and learning representations that account for hierarchical dependencies in heterophilic graphs.
The first project focuses on learning to disentangle content and motion information into separate latent components for image sequences. Here, content refers to information shared across all frames, for example, the identity of an object undergoing the dynamics, and motion refers to information expressed in a given sequence frame. The temporal structure in image sequences traces a path in a higher-dimensional data space that takes the form of a 1-dimensional manifold. A key challenge in learning representations from this data is designing a latent dynamical model that accounts for the temporal structure of image sequences. In this work, we utilise symplectic geometry in latent space for modelling the dynamics of various motions; this structure in latent space associates a motion with a constant energy term that captures the manifold of the dynamics of sequences. For a set of dynamical actions, we associate each action with a unique subspace that reflects the energy preservation of that action. Our results demonstrate that we can disentangle factors of variation, facilitating tasks such as controlled generation and motion transfer.
The second contribution proposes a robustness analysis of an oft-used representation learning framework, namely variational autoencoders (VAEs). It is vital that VAEs are built to be reliable, primarily for their real-world applications, such as latent space control in robotics or, in the medical domain, designing novel molecules by exploring the latent space. We examine latent space from a geometric standpoint and establish a connection between the vulnerability of VAEs to adversarial perturbations and the structure of the latent space. Our findings show that the learned latent manifold has a high curvature with low/zero density regions, making VAEs susceptible to adversarial attacks. We propose quantitative scores for measuring robustness and a simple training mechanism for enhancing it.
Lastly, we target the challenge of representation learning for data on graph domains with a heterophily property. In heterophilic graphs, nodes that are not in each other's immediate vicinity may share the same label due to their similar local connectivity structure. For example, in an academic network, two researchers in different countries can exhibit similar local connectivity due to the nature of their profession. We use diffusion geometry to explicitly model hierarchical dependencies in a graph in the form of augmentations. We then use these augmentations in a contrastive setup for learning representations of nodes in a graph. These representations can facilitate various downstream tasks, including graph classification, link prediction, and community detection. Our results showcase the effectiveness of augmentations in allowing the encoder to capture hierarchical dependencies, demonstrated by improved performance on several benchmark datasets.
In summary, through three core contributions, this thesis shows the importance of incorporating geometry-based inductive biases into deep representation learning models to develop efficient and reliable applications.
en
dc.identifier.uri
https://hdl.handle.net/1842/41243
dc.identifier.uri
http://dx.doi.org/10.7488/era/3979
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Khan, Asif, and Amos Storkey, "Hamiltonian Latent Operators for Content and Motion Disentanglement in Image Sequences". The Thirty-sixth Conference on Neural Information Processing Systems, 2022.
en
dc.relation.hasversion
Khan, Asif, and Amos Storkey, "Adversarial robustness of VAEs through the lens of local geometry." Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:8954-8967, 2023.
en
dc.relation.hasversion
Khan, Asif, and Amos Storkey, "Contrastive Learning for Non-local Graphs with Multi-Resolution Structural Views", arXiv preprint (2023).
en
dc.rights.license
Attribution-NonCommercial-ShareAlike 4.0 International
en
dc.rights.uri
http://creativecommons.org/licenses/by-nc-sa/4.0/
en
dc.subject
Geometry
en
dc.subject
deep representation learning
en
dc.subject
real-world data spaces
en
dc.subject
symplectic geometry
en
dc.subject
latent space
en
dc.subject
variational autoencoders (VAEs)
en
dc.title
Geometry for deep representation learning
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Name:
KhanMA_2023.pdf
Size:
23.65 MB
Format:
Adobe Portable Document Format