Meta learning for supervised and unsupervised few-shot learning
Authors
Antoniou, Antreas
Abstract
Meta-learning, or learning-to-learn, involves automatically learning training algorithms such that models trained with the learnt algorithms can solve a range of tasks while achieving high performance on a set of predefined objectives. Meta-learning has been used successfully in a wide variety of application areas. In this thesis, we present three meta-learning-based methods that effectively tackle the problem of few-shot learning, where a model must be learned from only a very small number of training samples per concept (e.g. 1-5 samples for each concept). Two of the proposed methods are trained using supervised learning, and the third uses self-supervised learning to achieve unsupervised meta-learning.
The proposed methods build on the Model-Agnostic Meta-Learning (MAML) framework. MAML is a gradient-based meta-learning method that learns a parameter initialization for a model such that, after the model has been updated a number of times on a small training set (i.e. a support set), it performs well on previously unseen instances of the classes it was trained on, referred to as a small validation set (i.e. a target set). The initialization is learned using two levels of learning: an inner-most level, where the initialization is updated towards the support set and evaluated on the target set, generating an objective that directly quantifies the generalization performance of the inner-loop model (i.e. the target-set loss); and an outer-most level, where the parameter initialization is learned using the gradients of the target-set loss with respect to the initialization.
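To make the two levels concrete, the bi-level update can be written as follows; the notation (θ for the initialization, S_b and T_b for the support and target sets of task b, α and β for the inner and outer learning rates, N inner steps over B tasks) is our own shorthand rather than the thesis's.

```latex
% Inner loop: N gradient steps on the support set S_b, starting from the
% shared initialization \theta
\theta_b^{(0)} = \theta, \qquad
\theta_b^{(i)} = \theta_b^{(i-1)}
  - \alpha \,\nabla_{\theta_b^{(i-1)}} \mathcal{L}_{S_b}\big(\theta_b^{(i-1)}\big)

% Outer loop: update \theta using the target-set loss of the adapted model
\theta \leftarrow \theta
  - \beta \,\nabla_{\theta} \sum_{b=1}^{B} \mathcal{L}_{T_b}\big(\theta_b^{(N)}\big)
```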
Our first method, referred to as MAML++, improves the highly unstable MAML method through a number of modifications to the batch normalization layers, the outer-loop loss formulation, and the formulation of the learning-rate scheduler used in the inner loop. Not only does MAML++ enable MAML to converge more reliably and efficiently, it also improves the model's generalization performance. We evaluate our method on Omniglot and Mini-ImageNet, where our model shows vastly improved convergence speed, stability, and generalization performance.
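One way to read the outer-loop loss modification is as a multi-step loss: the target-set loss is taken after every inner-loop step and combined with per-step importance weights, rather than only after the final step. The sketch below is a minimal Python illustration under our own assumptions; `inner_update`, `loss_fn`, and `step_weights` are hypothetical helpers and inputs, not the thesis implementation.

```python
def multi_step_target_loss(params, support, target,
                           inner_update, loss_fn,
                           num_steps, step_weights):
    """Weighted sum of target-set losses taken after every inner-loop step.

    `inner_update` (one support-set gradient step) and `loss_fn` (evaluate
    parameters on a batch) are assumed helpers. Weighting every step,
    rather than only the last, smooths the outer-loop gradients and is one
    of the stabilizing changes described above.
    """
    total_loss = 0.0
    for step in range(num_steps):
        params = inner_update(params, support)  # adapt on the support set
        total_loss = total_loss + step_weights[step] * loss_fn(params, target)
    return total_loss
```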
The second proposed method, referred to as Self-Critique and Adapt (SCA), builds on MAML++ by allowing the inner-loop model to adapt itself on the unlabelled target set, using an unsupervised loss function parameterized as a neural network. This unsupervised loss function is learned jointly with the parameter initialization and learning-rate scheduler, as done in MAML++. SCA further improves on MAML++, and our evaluation on Omniglot and Mini-ImageNet shows that it sets state-of-the-art (SOTA) performance in few-shot learning.
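As a concrete picture of such a learned loss, the sketch below shows one plausible shape for the critic: a small network mapping the base model's target-set predictions to a scalar loss that requires no labels to compute. The two-layer MLP architecture and all names here are our assumptions for illustration, not the thesis's actual critic.

```python
import torch.nn as nn

class CriticLoss(nn.Module):
    """Learned unsupervised loss: maps target-set predictions to a scalar.

    The architecture (a two-layer MLP over per-example logits) is an
    assumed minimal stand-in. Because the loss uses no labels, the inner
    loop can take extra adaptation steps on the unlabelled target set;
    the critic's own parameters are meta-learned in the outer loop.
    """

    def __init__(self, num_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, target_logits):
        # One scalar per example, averaged into a single adaptation loss.
        return self.net(target_logits).mean()
```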
The third proposed method, referred to as Assume, Augment and Learn (AAL), involves sampling pseudo-supervised tasks from an unsupervised training set by leveraging random labels and data augmentation. These tasks can then be used to directly train any few-shot learning model to perform well on a given dataset. We apply our method to MAML++ and ProtoNets on two datasets, Omniglot and Mini-ImageNet, where our models produce SOTA results on Omniglot and performance competitive with SOTA methods on Mini-ImageNet.
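The task-sampling idea can be sketched as follows; the function names and the choice of augmentation are our assumptions (e.g. random crops or flips), not the thesis's exact procedure. Each sampled unlabelled image is assigned an arbitrary class index to form the support set, and augmented copies of the same images, carrying the same labels, form the target set.

```python
import random

def sample_pseudo_task(unlabelled_images, num_classes, augment):
    """Build a pseudo-supervised N-way 1-shot task from unlabelled data.

    `augment` is an assumed stochastic augmentation function. Each sampled
    image is treated as its own class: the original forms the support set,
    and an augmented copy with the same randomly assigned label forms the
    corresponding target-set example.
    """
    images = random.sample(unlabelled_images, num_classes)
    support = [(img, label) for label, img in enumerate(images)]
    target = [(augment(img), label) for label, img in enumerate(images)]
    return support, target
```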