Edinburgh Research Archive

Advances in scene understanding: object detection, reconstruction, layouts, and inference

dc.contributor.advisor
Ferrari, Vittorio
en
dc.contributor.advisor
Hospedales, Timothy
en
dc.contributor.author
Henderson, Paul Matthew
en
dc.contributor.sponsor
Engineering and Physical Sciences Research Council (EPSRC)
en
dc.date.accessioned
2019-03-26T11:32:17Z
dc.date.available
2019-03-26T11:32:17Z
dc.date.issued
2019-07-01
dc.description.abstract
The goal of scene understanding is to capture the full content of an image in a human-interpretable representation. This must describe the different objects present, including their attributes such as class, shape, and pose, as well as the relations between objects. Moreover, the representation should be globally consistent across the entire image. In this thesis, we consider four sub-tasks within scene understanding, and make contributions to each. When describing the content of an image, it is natural to start by detecting all the objects that are present, that is, localising and classifying them. Our first contribution is to show how to train a neural-network-based object class detector end-to-end in a principled fashion, using the evaluation metric as the training loss, and using the same pipeline at both training and test time. This is simpler and more elegant than the traditional approach of using a surrogate loss, yet we show it achieves comparable performance. Once the location and class of an object are known, we can estimate its shape and pose in 3D space. Our second contribution is a new approach to these tasks, which supports training purely from 2D images, without 3D supervision, multiple views, or annotations such as pose or keypoints. Moreover, this model is generative, and so allows sampling new object shapes a priori. To produce a globally-consistent description of a scene, it is important to reason over all objects simultaneously, rather than considering each individually. Our third contribution is a probabilistic generative model over complete indoor scene layouts. It models complex arrangements in 3D space, including high-order spatial relations among furniture and other objects. One common approach to generating predictions that are consistent over all objects in a scene, or pixels in an image, is to formulate and solve a discrete energy minimisation problem. The energy is defined as a sum over factors, and the factor structure greatly affects which minimisation algorithms work well. Our fourth contribution is a method that automatically selects a suitable algorithm to solve a given energy minimisation problem. To do so, it learns to predict the best algorithm based on characteristics of the problem instance.
en
dc.identifier.uri
http://hdl.handle.net/1842/35600
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Henderson, P. and Ferrari, V. (2016a). Automatically selecting inference algorithms for discrete energy minimisation. In Proceedings of the European Conference on Computer Vision, pages 235–252.
en
dc.relation.hasversion
Henderson, P. and Ferrari, V. (2016b). End-to-end training of object class detectors for mean average precision. In Proceedings of the Asian Conference on Computer Vision, pages 198–213.
en
dc.relation.hasversion
Henderson, P. and Ferrari, V. (2018). Learning to generate and reconstruct 3D meshes with only 2D supervision. In Proceedings of the British Machine Vision Conference.
en
dc.subject
scene understanding
en
dc.subject
globally-consistent
en
dc.subject
image description
en
dc.subject
neural-network
en
dc.subject
object class detection
en
dc.subject
probabilistic generative models
en
dc.subject
3D space
en
dc.subject
minimisation algorithms
en
dc.title
Advances in scene understanding: object detection, reconstruction, layouts, and inference
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Name:
Henderson2019.pdf
Size:
97.75 MB
Format:
Adobe Portable Document Format
