Exploiting diversity for efficient machine learning
dc.contributor.advisor
Sutton, Charles
en
dc.contributor.advisor
Storkey, Amos
en
dc.contributor.author
Geras, Krzysztof Jerzy
en
dc.contributor.sponsor
Engineering and Physical Sciences Research Council (EPSRC)
en
dc.date.accessioned
2018-03-16T12:08:49Z
dc.date.available
2018-03-16T12:08:49Z
dc.date.issued
2018-07-02
dc.description.abstract
A common practice for solving machine learning problems is currently to consider
each problem in isolation, starting from scratch every time a new learning problem
is encountered or a new model is proposed. This is a perfectly feasible solution
when the problems are sufficiently easy or, if a problem is hard, when a large
amount of resources, in terms of both training data and computation, is available.
Although this naive approach has been the main focus of machine learning research
for a few decades and has had considerable success, it becomes infeasible
if the problem is too hard in proportion to the available resources. When using
a complex model in this naive approach, it is necessary to collect large data
sets (if possible at all) to avoid overfitting, and hence it is also necessary to use
large computational resources: first during training, to process a large data set,
and then at test time, to execute a complex model.
An alternative to this strategy of treating each learning problem independently
is to leverage related data sets and the computation encapsulated in previously
trained models. By doing so we can decrease the amount of data necessary to
reach a satisfactory level of performance and, consequently, improve the achievable
accuracy and decrease training time. Our attack on this problem is to exploit
diversity: in the structure of the data set, in the features learnt, and in the
inductive biases of different neural network architectures.
In the setting of learning from multiple sources, we introduce multiple-source
cross-validation, which gives an unbiased estimator of the test error when the data
set is composed of data coming from multiple sources and the data at test time
come from a new, unseen source. We also propose new estimators of the variance
of standard k-fold cross-validation and of multiple-source cross-validation, which
have lower bias than previously known ones.
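The core idea can be illustrated with a short sketch: instead of splitting the data randomly, each cross-validation fold holds out one entire source, so every test fold mimics deployment on data from a new, unseen source. This is only an illustration of the fold construction, not the thesis's exact estimator; the function name and the toy source labels below are hypothetical.

```python
def multiple_source_cv_splits(sources):
    """Yield (held_out_source, train_idx, test_idx), one fold per source.

    `sources` assigns a source label to each example; each fold tests on
    all examples from exactly one source and trains on the rest.
    """
    for held_out in sorted(set(sources)):
        test_idx = [i for i, s in enumerate(sources) if s == held_out]
        train_idx = [i for i, s in enumerate(sources) if s != held_out]
        yield held_out, train_idx, test_idx

# Example: six examples drawn from three sources.
sources = ["a", "a", "b", "b", "c", "c"]
folds = list(multiple_source_cv_splits(sources))
```

Averaging a model's test error over such folds estimates its error on a genuinely new source, which a random split would over-optimistically conflate with within-source error.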
To improve unsupervised learning, we introduce scheduled denoising autoencoders,
which learn a more diverse set of features than the standard denoising
autoencoder. This is due to their training procedure, which starts with a
high level of noise, under which the network learns coarse features, and then
gradually lowers the noise, allowing the network to learn more local
features. A connection between this training procedure and curriculum learning
is also drawn. We develop the idea of learning a diverse representation further
by explicitly incorporating the goal of obtaining a diverse representation into the
training objective. The proposed model, the composite denoising autoencoder,
learns multiple subsets of features, each focused on modelling variations in the
data set at a different level of granularity.
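The scheduling idea can be sketched as follows: training begins with heavy input corruption and the noise level is annealed downward over epochs. The linear schedule, the particular start/end levels, and the masking-noise corruption below are illustrative assumptions, not the exact recipe from the thesis.

```python
import random

def noise_schedule(epoch, n_epochs, start=0.7, end=0.1):
    """Linearly anneal the corruption probability from `start` to `end`."""
    t = epoch / max(n_epochs - 1, 1)
    return start + t * (end - start)

def corrupt(x, p, rng):
    """Masking noise: zero each input dimension independently with prob. p."""
    return [0.0 if rng.random() < p else v for v in x]

rng = random.Random(0)
# Early epochs see heavily corrupted inputs (coarse, global features);
# later epochs see lightly corrupted inputs (finer, more local features).
levels = [noise_schedule(e, 10) for e in range(10)]
```

At each epoch the autoencoder would be trained to reconstruct clean inputs from `corrupt(x, levels[epoch], rng)`, so the sequence of noise levels plays the role of a curriculum from easy, global structure toward fine detail.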
Finally, we introduce the idea of model blending, a variant of model compression
in which both models, the teacher and the student, are strong models
that differ in their inductive biases. As an example, we train convolutional
networks using the guidance of bidirectional long short-term memory
(LSTM) networks. This allows us to train the convolutional neural network to be
more accurate than the LSTM network at no extra cost at test time.
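A minimal sketch of the kind of objective involved: the student is trained to match the teacher's temperature-softened output distribution. This standard distillation-style cross-entropy stands in for the thesis's exact blending objective, and the function names are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax of `logits` at the given temperature."""
    z = [l / temperature for l in logits]
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def blending_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between teacher's and student's softened outputs."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))

# The loss is smallest when the student reproduces the teacher's outputs.
loss_matched = blending_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_off = blending_loss([-1.0, 0.5, 2.0], [2.0, 0.5, -1.0])
```

Because the soft targets carry more information per example than hard labels, a student with a different inductive bias (here, a CNN guided by an LSTM) can end up more accurate than its teacher while keeping the student's cheaper test-time cost.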
en
dc.identifier.uri
http://hdl.handle.net/1842/28839
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Gregor Urban, Krzysztof J. Geras, Samira Ebrahimi Kahou, Ozlem Aslan, Shengjie Wang, Rich Caruana, Abdel-rahman Mohamed, Matthai Philipose, and Matthew Richardson. Do deep convolutional nets really need to be deep (or even convolutional)? In ICLR (workshop track), 2016.
en
dc.relation.hasversion
Krzysztof J. Geras and Charles Sutton. Multiple-source cross-validation. In International Conference on Machine Learning, 2013.
en
dc.relation.hasversion
Krzysztof J. Geras and Charles Sutton. Scheduled denoising autoencoders. In International Conference on Learning Representations, 2015.
en
dc.relation.hasversion
Krzysztof J. Geras, Abdel-rahman Mohamed, Rich Caruana, Gregor Urban, Shengjie Wang, Ozlem Aslan, Matthai Philipose, Matthew Richardson, and Charles Sutton. Blending LSTMs into CNNs. In International Conference on Learning Representations (workshop track), 2016.
en
dc.relation.hasversion
Krzysztof J. Geras and Charles Sutton. Composite denoising autoencoders. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016.
en
dc.subject
machine learning
en
dc.subject
deep learning
en
dc.subject
cross-validation
en
dc.subject
autoencoders
en
dc.subject
convolutional neural networks
en
dc.subject
model compression
en
dc.title
Exploiting diversity for efficient machine learning
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
- Geras2018.pdf (2.85 MB, Adobe Portable Document Format)