Unsupervised category-level viewpoint estimation

Mariotti, Octave

dc.contributor.advisor

Bilen, Hakan

dc.contributor.advisor

Gutmann, Michael Urs

dc.contributor.author

Mariotti, Octave

dc.date.accessioned

2023-04-25T11:11:49Z

dc.date.available

2023-04-25T11:11:49Z

dc.date.issued

2023-04-25

dc.description.abstract

Recent progress in deep learning has transformed the field of computer vision, with tasks like object classification or segmentation being considered almost solved. This, however, requires sufficiently many labeled samples to train the system; hence, research focus has shifted towards tasks where collecting such data is challenging. Recovering camera poses is one such task, where labels are typically too costly for supervised approaches. This work explores solutions to train camera pose estimation systems without the need for external supervision. Preliminary assessments show that it is possible to formulate this problem as a self-supervised reconstruction task. By interpreting a network output as a 3D rotation, and using this output to control a differentiable rendering operation, gradient descent can be used to train the network to predict viewpoint information. However, multiple issues arise when applying such a method naively on complex data. Confounding factors of particular importance are symmetries, geometry-breaking rendering pipelines and background-induced noise. This leads to a regime where purely self-supervised training breaks, although semi-supervised approaches are still successful. Specific solutions to the aforementioned problems are therefore studied and evaluated. For symmetries, multiple viewpoint predictions are made, and their distribution is further regulated. Two main rendering pipelines are also compared to improve over naive convolution-based reconstruction: a voxel-based one, and a more recent implicit neural representation. Experimental evidence shows that carefully crafting a system with these improvements allows recovery of poses on many everyday objects, such as cars and chairs, with performance reaching the level of supervised approaches on some categories. In addition, this thesis underlines two potential problems in related approaches. The first is the pose retrieval method used in recent implicit representations, which is unstable and prohibitively expensive. The second is an insidious issue in unsupervised methods, arising from a combination of dataset biases and naive calibration. As this potentially leads to overestimated performance, it calls for a more robust evaluation standard, as well as more careful data gathering.

en

dc.identifier.uri

https://hdl.handle.net/1842/40529

dc.identifier.uri

http://dx.doi.org/10.7488/era/3295

dc.language.iso

en

dc.publisher

The University of Edinburgh

en

dc.relation.hasversion

Mariotti, O. and Bilen, H. (2020). Semi-supervised viewpoint estimation with geometry aware conditional generation. In European Conference on Computer Vision, pages 631–647. Springer

en

dc.relation.hasversion

Mariotti, O., Mac Aodha, O., and Bilen, H. (2021). Viewnet: Unsupervised viewpoint estimation from conditional generation. In International Conference on Computer Vision, pages 10418–10428.

en

dc.relation.hasversion

Mariotti, O., Mac Aodha, O., and Bilen, H. (2022). Viewnerf: Unsupervised viewpoint estimation using category-level neural radiance fields. arXiv preprint arXiv:2212.00436

en

dc.subject

deep learning

en

dc.subject

camera pose estimation systems

en

dc.subject

symmetries

en

dc.subject

multiple viewpoint predictions

en

dc.subject

unstable pose retrieval methods

en

dc.subject

dataset biases

en

dc.title

Unsupervised category-level viewpoint estimation

en

dc.type

Thesis or Dissertation

en

dc.type.qualificationlevel

Doctoral

en

dc.type.qualificationname

PhD Doctor of Philosophy

en

Files

Original bundle

Name: Mariotti2023.pdf
Size: 8.41 MB
Format: Adobe Portable Document Format

This item appears in the following Collection(s)

Informatics thesis and dissertation collection