Dynamic Trees: A Hierarchical Probabilistic Approach to Image Modelling
Adams, Nicholas J.
This work introduces a new class of image model which we call dynamic trees or DTs. A dynamic tree model specifies a prior over structures of trees, each of which is a forest of one or more tree-structured belief networks (TSBN). In the literature standard tree-structured belief network models were found to produce “blocky” segmentations when naturally occurring boundaries within an image did not coincide with those of the subtrees in the rigid fixed structure of the network. Dynamic trees have a flexible architecture which allows the structure to vary to accommodate configurations where the subtree and image boundaries align, and experimentation with the model showed significant improvements. They are also hierarchical in nature allowing a multi-scale representation and are constructed within a well founded Bayesian framework. For large models the number of tree configurations quickly becomes intractable to enumerate over, presenting a problem for exact inference. Techniques such as Gibbs sampling over trees are considered and search using simulated annealing finds high posterior probability trees on synthetic 2-d images generated from the model. However simulated annealing and sampling techniques are rather slow. Variational methods are applied to the model in an attempt to approximate the posterior by a simpler tractable distribution, and the simplest of these techniques, mean field, found comparable solutions to simulated annealing in the order of 100 times faster. This increase in speed goes a long way towards making real-time inference in the dynamic tree viable. Variational methods have the further advantage that by attempting to model the full posterior distribution it is possible to gain an indication as to the quality of the solutions found. An EM-style update based upon mean field inference is derived and the learned conditional probability tables (describing state transitions between a node and its parent) are compared with exact EM on small tractable fixed architecture models. The mean field approximation by virtue of its form is biased towards fully factorised solutions which tends to create degenerate CPTs, but despite this mean field learning still produces solutions whose log likelihood rivals exact EM. Development of algorithms for learning the probabilities of the prior over tree structures completes the dynamic tree picture. After discussion of the relative merits of certain representations for the disconnection probabilities and initial investigation on small model structures the full dynamic tree model is applied to a database of images of outdoor scenes where all of its parameters are learned. DTs are seen to offer significant improvement in performance over the fixed architecture TSBN and in a coding comparison the DT achieves 0 294 bits per pixel (bpp) compression compared to 0 378 bpp for lossless JPEG on images of 7 colours.