Edinburgh Research Archive

Learning about the learning process: from active querying to fine-tuning

Abstract

The majority of academic machine learning research addresses the core model-fitting part of the machine learning workflow. However, data collection and annotation is an important step prior to model fitting, and knowledge transfer to different but related problems is important subsequently. Recently, the core model-fitting step in this workflow has been upgraded using learning-to-learn methodologies, where learning algorithms are applied to improve the fitting algorithm itself in terms of computation or data efficiency. However, algorithms for data collection and knowledge transfer are still commonly hand-engineered. In this doctoral thesis, we upgrade the pre- and post-processing steps of the machine learning pipeline with the learning-to-learn paradigm.

We first present novel learning-to-learn approaches that improve the algorithms for the pre-processing step in terms of label efficiency. Inefficient data annotation is a common issue in the field: to fit the desired model, a large amount of data is usually collected and annotated, much of which turns out to be unnecessary. Active learning aims to address this by selecting the most suitable data for annotation. Since conventional active learning algorithms are hand-engineered and heuristically designed for a specific problem, they typically cannot adapt across, or even within, datasets. The data efficiency of active learning can be improved either by learning active learning online within a specific problem, or by transferring active-learning knowledge between related problems. We begin by investigating a framework for learning active learning online, which learns to select the best criterion for a particular dataset as queries are made. It enables online adaptation as the state of the model and dataset changes, while guaranteeing performance. Subsequently, we upgrade this framework to a data-driven learning-based approach by learning a transferable active-learning policy end-to-end.
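The online criterion-selection step described above can be viewed as a bandit problem over a set of query criteria, where each criterion's value is re-estimated as annotation proceeds. The following is a minimal epsilon-greedy sketch of that idea, not the thesis's actual algorithm; the class and function names, and the reward signal (e.g. a validation-accuracy gain after labelling), are illustrative assumptions.

```python
import random

class CriterionBandit:
    """Epsilon-greedy bandit that learns online which query
    criterion works best for the current dataset."""

    def __init__(self, criteria, epsilon=0.1):
        self.criteria = criteria              # callables: pool -> item index
        self.epsilon = epsilon                # exploration rate
        self.counts = [0] * len(criteria)     # times each criterion was used
        self.values = [0.0] * len(criteria)   # running mean reward per criterion

    def select(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.criteria))   # explore
        return max(range(len(self.criteria)),
                   key=lambda a: self.values[a])           # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        # incremental running mean of observed rewards
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

def active_learning_round(bandit, pool, evaluate):
    """One query: pick a criterion, let it choose an item to label,
    then credit the criterion with the observed reward."""
    arm = bandit.select()
    idx = bandit.criteria[arm](pool)
    reward = evaluate(idx)       # e.g. model-accuracy gain from this label
    bandit.update(arm, reward)
    return idx
```

As queries accumulate, the bandit shifts probability mass toward whichever criterion is currently paying off, which is what allows adaptation both within and, in principle, across datasets.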
The resulting framework is capable of directly optimising the accuracy of the underlying classifier, and can adapt to the statistics of any given dataset. More importantly, the learned active-learning policy is domain-agnostic and generalises to new learning problems.

We next turn to knowledge transfer from a well-learned problem to a novel target problem. We develop a new learning-to-learn technique to improve the effectiveness and efficiency of fine-tuning-based transfer learning. Conventional transfer learning approaches are heuristic: most commonly, small-learning-rate stochastic gradient descent is run from the source model as the initial condition, keeping the architecture constant. However, the typical transfer learning pipeline transfers from a general model or dataset to a more specific one. We therefore propose a transfer learning algorithm for neural networks that simultaneously prunes the target network's architecture and updates its weights. This allows model complexity to be reduced as training iterations increase, and both efficiency and efficacy are improved compared to conventional fine-tuning knowledge transfer.
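The prune-while-fine-tuning idea can be illustrated on a flat weight vector: alternate gradient steps (fine-tuning from the source weights) with magnitude pruning whose sparsity grows over training, so the model shrinks as iterations increase. This is a toy sketch under those assumptions, not the algorithm from the thesis; the function name, linear sparsity schedule, and user-supplied gradient function are all illustrative.

```python
def finetune_with_pruning(weights, grad_fn, steps=100, lr=0.01,
                          final_sparsity=0.5):
    """Toy sketch: SGD updates interleaved with magnitude pruning.
    The pruned fraction ramps linearly from 0 to final_sparsity."""
    mask = [1.0] * len(weights)              # 0.0 marks a pruned weight
    for t in range(1, steps + 1):
        grads = grad_fn(weights)
        # gradient step; pruned weights stay at zero
        weights = [m * (w - lr * g)
                   for w, g, m in zip(weights, grads, mask)]
        # gradually raise the fraction of weights to prune
        sparsity = final_sparsity * t / steps
        k = int(sparsity * len(weights))
        # zero out the k smallest-magnitude weights
        order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
        for i in order[:k]:
            mask[i] = 0.0
            weights[i] = 0.0
    return weights, mask
```

In this sketch, weights that the target task pulls toward zero are the first to fall below the magnitude threshold and be removed, so pruning and weight updating reinforce each other rather than being run as separate stages.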
