Edinburgh Research Archive

Bayesian Multisensory Perception

View/Open
thesis.pdf (5.995Mb)
Date
24/06/2008
Author
Hospedales, Timothy
Abstract
A key goal for humans and artificial intelligence systems is to develop an accurate and unified picture of the outside world based on the data from any sense(s) that may be available. The availability of multiple senses presents the perceptual system with new opportunities to fulfil this goal, but exploiting these opportunities first requires the solution of two related tasks. The first is how to make the best use of any redundant information from the sensors to produce the most accurate percept of the state of the world. The second is how to interpret the relationship between observations in each modality; for example, the correspondence problem of whether or not they originate from the same source. This thesis investigates these questions using ideal Bayesian observers as the underlying theoretical approach. In particular, the latter correspondence task is treated as a problem of Bayesian model selection or structure inference in Bayesian networks. This approach provides a unified and principled way of representing and understanding the perceptual problems faced by humans and machines and their commonality. In the domain of machine intelligence, we exploit the developed theory for practical benefit, developing a model to represent audio-visual correlations. Unsupervised learning in this model provides automatic calibration and user appearance learning, without human intervention. Inference in the model involves explicit reasoning about the association between latent sources and observations. This provides audio-visual tracking through occlusion with improved accuracy compared to standard techniques. It also provides detection, verification and speech segmentation, ultimately allowing the machine to understand "who said what, where?" in multi-party conversations.
In the domain of human neuroscience, we show how a variety of recent results in multimodal perception can be understood as the consequence of probabilistic reasoning about the causal structure of multimodal observations. We show this for a localisation task in audio-visual psychophysics, which is very similar to the task solved by our machine learning system. We also use the same theory to understand results from experiments in the completely different paradigm of oddity detection using visual and haptic modalities. These results begin to suggest that the human perceptual system performs, or at least approximates, sophisticated probabilistic reasoning about the causal structure of observations under the hood.
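
As a rough illustration of the two tasks named in the abstract (fusing redundant cues, and deciding whether two observations share a source), the following is a minimal sketch and not the model from the thesis. It assumes one-dimensional locations, Gaussian sensor noise and a zero-mean Gaussian prior over source location; the function names and all parameters (`sigma_a`, `sigma_v`, `sigma_p`, `prior_common`) are hypothetical choices made for this example.

```python
import math


def log_gauss2(x1, x2, var1, var2, cov):
    """Log density of a zero-mean bivariate Gaussian at (x1, x2)."""
    det = var1 * var2 - cov ** 2
    quad = (var2 * x1 ** 2 - 2.0 * cov * x1 * x2 + var1 * x2 ** 2) / det
    return -0.5 * quad - 0.5 * math.log(det) - math.log(2.0 * math.pi)


def posterior_common_cause(x_a, x_v, sigma_a, sigma_v, sigma_p, prior_common=0.5):
    """Posterior probability that audio and visual observations share one source.

    Assumed model (illustrative only):
      common cause:       s ~ N(0, sigma_p^2), x_a ~ N(s, sigma_a^2), x_v ~ N(s, sigma_v^2)
      independent causes: each observation has its own source drawn from the same prior
    Integrating out the source(s) gives a bivariate Gaussian marginal likelihood
    whose off-diagonal covariance is sigma_p^2 (common) or 0 (independent).
    """
    var_a = sigma_p ** 2 + sigma_a ** 2
    var_v = sigma_p ** 2 + sigma_v ** 2
    log_common = log_gauss2(x_a, x_v, var_a, var_v, sigma_p ** 2) + math.log(prior_common)
    log_indep = log_gauss2(x_a, x_v, var_a, var_v, 0.0) + math.log(1.0 - prior_common)
    # Normalise in log space to avoid underflow for very discrepant observations.
    m = max(log_common, log_indep)
    w_common = math.exp(log_common - m)
    w_indep = math.exp(log_indep - m)
    return w_common / (w_common + w_indep)


def fused_estimate(x_a, x_v, sigma_a, sigma_v, sigma_p):
    """Posterior mean of the source location if a common cause is assumed.

    This is the precision-weighted average of the two observations and the
    zero prior mean, so the more reliable sensor dominates.
    """
    w_a = 1.0 / sigma_a ** 2
    w_v = 1.0 / sigma_v ** 2
    w_p = 1.0 / sigma_p ** 2
    return (w_a * x_a + w_v * x_v) / (w_a + w_v + w_p)


if __name__ == "__main__":
    # Nearby observations favour a common cause; distant ones favour separate sources.
    print(posterior_common_cause(0.0, 0.0, sigma_a=2.0, sigma_v=1.0, sigma_p=10.0))
    print(posterior_common_cause(10.0, -10.0, sigma_a=2.0, sigma_v=1.0, sigma_p=10.0))
    print(fused_estimate(1.0, 3.0, sigma_a=2.0, sigma_v=1.0, sigma_p=10.0))
```

Under these assumptions, the posterior over causal structure falls as the audio and visual observations move apart, and the fused estimate is only meaningful when a common cause is likely. A fuller treatment would weight the fused and unfused estimates by that posterior.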
URI
http://hdl.handle.net/1842/2156
Collections
  • Informatics thesis and dissertation collection