Edinburgh Research Archive

Meta learning for supervised and unsupervised few-shot learning

Authors

Antoniou, Antreas

Abstract

Meta-learning, or learning-to-learn, involves automatically learning training algorithms such that models trained with the learned algorithms can solve a range of tasks while performing well on a number of predefined objectives. Meta-learning has been applied successfully in a wide variety of application areas. In this thesis, we present three meta-learning-based methods that effectively tackle the problem of few-shot learning, where a model must be learned from only a very small number of training samples per concept (e.g. 1-5 samples per concept). Two of the proposed methods are trained with supervised learning; the third uses self-supervised learning to achieve unsupervised meta-learning.

The proposed methods build on the Model Agnostic Meta-Learning (MAML) framework. MAML is a gradient-based meta-learning method that learns a parameter initialization for a model such that, after the model has been updated for a small number of steps on a small training set (i.e. a support set), it performs well on previously unseen instances of the classes it was trained on, usually collected in a small validation set (i.e. a target set). The initialization is learned using two levels of learning: an inner level, where the initialization is updated towards the support set and evaluated on the target set, producing an objective that directly quantifies the generalization performance of the inner-loop model (i.e. the target-set loss); and an outer level, where the parameter initialization is learned using the gradients of the target-set loss.

Our first method, MAML++, stabilizes the otherwise highly unstable MAML through a number of modifications to the batch normalization layers, the outer-loop loss formulation, and the learning-rate schedule used in the inner loop. MAML++ not only enables MAML to converge more reliably and efficiently, but also improves generalization performance. Evaluated on Omniglot and Mini-ImageNet, the model shows greatly improved convergence speed, stability, and generalization performance.

The second method, Self-Critique and Adapt (SCA), builds on MAML++ by allowing the inner-loop model to adapt itself on the unlabelled target set, using an unsupervised loss function parameterized as a neural network. This unsupervised loss function is learned jointly with the parameter initialization and learning-rate schedule of the model, as in MAML++. SCA further improves on MAML++ and achieves state-of-the-art (SOTA) few-shot learning performance on Omniglot and Mini-ImageNet.

The third method, Assume, Augment and Learn (AAL), samples pseudo-supervised tasks from an unsupervised training set by assigning random labels and applying data augmentation. These tasks can then be used to directly train any few-shot learning model to perform well on a given dataset. We apply AAL to MAML++ and ProtoNets on Omniglot and Mini-ImageNet, where it produces SOTA results on Omniglot and performance competitive with SOTA methods on Mini-ImageNet.
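To make the bi-level structure described above concrete, the following is a minimal sketch of MAML's two levels of learning on a toy few-shot regression problem. The sine-wave task family, network sizes, step counts, and learning rates are all illustrative assumptions, not the thesis's experimental configuration.

```python
import torch

def forward(params, x):
    # Tiny functional MLP: params = [W1, b1, W2, b2].
    h = torch.relu(x @ params[0] + params[1])
    return h @ params[2] + params[3]

def sample_task(batch=10):
    # Hypothetical task family: sine waves with random amplitude/phase.
    amp, phase = torch.rand(1) * 4 + 0.1, torch.rand(1) * 3.14
    xs, xt = (torch.rand(batch, 1) * 10 - 5 for _ in range(2))
    return xs, amp * torch.sin(xs + phase), xt, amp * torch.sin(xt + phase)

# The meta-learned parameter initialization (the outer-loop parameters).
init = [p.requires_grad_() for p in (0.1 * torch.randn(1, 40),
                                     torch.zeros(40),
                                     0.1 * torch.randn(40, 1),
                                     torch.zeros(1))]
meta_opt = torch.optim.Adam(init, lr=1e-3)
inner_lr = 0.01

for _ in range(1000):
    meta_opt.zero_grad()
    xs, ys, xt, yt = sample_task()
    # Inner level: adapt the initialization towards the support set.
    fast = init
    for _ in range(5):
        support_loss = torch.nn.functional.mse_loss(forward(fast, xs), ys)
        grads = torch.autograd.grad(support_loss, fast, create_graph=True)
        fast = [p - inner_lr * g for p, g in zip(fast, grads)]
    # Outer level: the target-set loss quantifies post-adaptation
    # generalization; its gradient flows through the inner updates
    # back into the initialization `init`.
    target_loss = torch.nn.functional.mse_loss(forward(fast, xt), yt)
    target_loss.backward()
    meta_opt.step()
```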
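As one concrete example of the MAML++ modifications, here is a sketch of a learned per-step, per-parameter inner-loop learning-rate schedule, replacing the fixed `inner_lr` in the MAML sketch above. The initial rate values are an illustrative assumption, and the other MAML++ changes (batch normalization handling and the multi-step outer-loop loss) are omitted for brevity.

```python
import torch  # reuses `forward` and `init` from the MAML sketch above

INNER_STEPS = 5
# One learnable rate per parameter tensor per inner step, meta-learned
# by the outer-loop optimizer alongside the initialization.
inner_lrs = [torch.full((len(init),), 0.01).requires_grad_()
             for _ in range(INNER_STEPS)]
meta_opt = torch.optim.Adam(init + inner_lrs, lr=1e-3)

def inner_adapt(xs, ys):
    # Same inner loop as before, but each step and each parameter
    # tensor now uses its own meta-learned learning rate.
    fast = init
    for step in range(INNER_STEPS):
        loss = torch.nn.functional.mse_loss(forward(fast, xs), ys)
        grads = torch.autograd.grad(loss, fast, create_graph=True)
        fast = [p - inner_lrs[step][i] * g
                for i, (p, g) in enumerate(zip(fast, grads))]
    return fast
```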
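The self-critique step in SCA can likewise be sketched on top of the MAML loop above. The critic here scores the model's raw target-set predictions, which is an illustrative simplification; its architecture and inputs are assumptions, not the thesis's learned loss network.

```python
import torch  # reuses `forward` and `init` from the MAML sketch above

# Learned unsupervised loss: maps the model's target-set predictions
# to a scalar "loss" value, meta-learned jointly with `init`.
critic = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.ReLU(),
                             torch.nn.Linear(32, 1))
meta_opt = torch.optim.Adam(init + list(critic.parameters()), lr=1e-3)

def self_critique(fast, xt, inner_lr=0.01, steps=1):
    # Extra inner-loop updates on the *unlabelled* target set: the
    # model criticizes and adapts its own predictions via the learned
    # unsupervised loss.
    for _ in range(steps):
        unsup_loss = critic(forward(fast, xt)).mean()
        grads = torch.autograd.grad(unsup_loss, fast, create_graph=True)
        fast = [p - inner_lr * g for p, g in zip(fast, grads)]
    return fast

# In the outer loop, the supervised target-set loss is computed *after*
# self-critique, so backpropagating it trains both `init` and `critic`:
#   fast = self_critique(fast, xt)
#   target_loss = torch.nn.functional.mse_loss(forward(fast, xt), yt)
#   target_loss.backward(); meta_opt.step()
```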
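Finally, a minimal sketch of AAL-style pseudo-task sampling from unlabelled data. The specific augmentation, way/shot configuration, and tensor shapes are illustrative assumptions.

```python
import torch

def augment(x):
    # Illustrative augmentation: random horizontal flip plus pixel noise.
    if torch.rand(1) < 0.5:
        x = torch.flip(x, dims=[-1])
    return x + 0.05 * torch.randn_like(x)

def sample_pseudo_task(images, n_way=5):
    # Draw n_way unlabelled images and treat each as its own class,
    # assigning arbitrary (random) labels.
    idx = torch.randperm(len(images))[:n_way]
    support_x, support_y = images[idx], torch.arange(n_way)
    # The target set is built from augmented copies of the same images,
    # so the random labels stay consistent within the task.
    target_x = torch.stack([augment(img) for img in support_x])
    return support_x, support_y, target_x, support_y.clone()

# Any few-shot learner (e.g. the MAML loop above, or ProtoNets) can
# then be trained directly on these pseudo-supervised tasks.
```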
