Learning Dynamics for Robot Control under Varying Contexts
High-fidelity, compliant robot control requires a sufficiently accurate dynamics model. Often, though, analytical methods cannot yield a sufficiently accurate dynamics model, or any model at all. In such cases, an alternative is to learn the dynamics model from movement data. This thesis discusses the problems specific to learning dynamics for control when the dynamics are nonstationary. We refer to the cause of the nonstationarity as the context of the dynamics. Contexts are typically not directly observable. For instance, the dynamics of a robot manipulator change as the robot manipulates different objects, and the physical properties of the load (the context of the dynamics) are not directly known to the controller. Other examples of contexts that affect the dynamics are changing force fields, or liquids of different viscosity in which a manipulator has to operate. The learned dynamics model needs to be adapted whenever the context, and therefore the dynamics, changes. Inevitably, performance drops during the period of adaptation. The goal of this work is to reuse and generalize the experience obtained by learning the dynamics of different contexts in order to adapt quickly to changing contexts. We first examine the case in which the dynamics may switch between a discrete, finite set of contexts; here we use multiple models, and switching between them, to adapt the controller quickly. We use a probabilistic formulation of multiple models in which a discrete latent variable represents the unobserved context and indexes the models. In contrast to previous multiple-model approaches, the developed method is able to learn multiple models of nonlinear dynamics, using an appropriately modified EM algorithm. We also deal with the case in which there exists a continuum of possible contexts that affect the dynamics, so that it becomes essential to generalize from a set of experienced contexts to novel contexts.
There is very little previous work in this direction, and the developed methods are novel. We introduce a set of continuous latent variables to represent the context, together with a dynamics model that depends on these variables. We first examine learning and inference in such a model when there is strong prior knowledge about how these continuous latent variables modulate the dynamics, e.g., when the load at the end effector changes. We also develop methods for the case in which no such knowledge is available. Finally, we formulate a dynamics model whose input is augmented with observed variables that convey contextual information indirectly, e.g., the information from tactile sensors at the interface between the load and the arm. This approach also allows generalization to previously unseen contexts and is applicable when the nature of the context is not known. In addition, we show that such a model can be used even when special sensory input is not available, by using an instance of an autoregressive model. The developed methods are tested on realistic, full-physics simulations of robot arm systems, including a simple 3-degree-of-freedom (DOF) arm and a simulation of the 7-DOF DLR light-weight robot arm. In the experiments, the varying contexts correspond to different manipulated objects. Nevertheless, the developed methods (with the exception of those that require prior knowledge about how the context modulates the dynamics) are more generally applicable and could be used to deal with other context variation scenarios.
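
As an illustration of the multiple-model idea for a discrete set of contexts, the following minimal sketch tracks a belief over an unobserved context by comparing each model's torque prediction with the measured torque. It is not the method developed in the thesis: the models here are fixed and known rather than learned with EM, and the toy 1-DOF dynamics, the "sticky" transition matrix, the noise level, and all names (`context_posterior`, `filter_contexts`, `ARM_MASS`, `LOADS`) are assumptions made purely for the example.

```python
import numpy as np


def context_posterior(prior, predictions, observed, noise_std):
    """One Bayesian update of the belief over K discrete contexts.

    prior:       shape (K,), belief before seeing the measurement
    predictions: shape (K, D), torque predicted by each context's model
    observed:    shape (D,), measured torque
    noise_std:   assumed isotropic Gaussian noise level (hypothetical)
    """
    sq_err = np.sum((predictions - observed) ** 2, axis=1)
    log_post = np.log(prior) - 0.5 * sq_err / noise_std ** 2
    log_post -= log_post.max()  # shift for numerical stability
    post = np.exp(log_post)
    return post / post.sum()


def filter_contexts(models, states, torques, transition, noise_std=0.1):
    """Run the belief update over a trajectory and return all beliefs."""
    belief = np.full(len(models), 1.0 / len(models))
    history = []
    for x, tau in zip(states, torques):
        # Prediction step: the context may have switched since last step.
        belief = transition.T @ belief
        preds = np.stack([m(x) for m in models])
        belief = context_posterior(belief, preds, tau, noise_std)
        history.append(belief)
    return np.array(history)


# Toy 1-DOF example (assumed): torque = (arm mass + load mass) * acceleration.
ARM_MASS = 1.0
LOADS = [0.0, 1.0, 2.0]  # one hypothetical load per context
models = [lambda a, load=load: (ARM_MASS + load) * a for load in LOADS]

t = np.arange(40)
states = (1.0 + 0.5 * np.sin(0.3 * t))[:, None]  # accelerations, shape (40, 1)
true_load = np.where(t < 20, 0.0, 2.0)           # context switches at t = 20
torques = (ARM_MASS + true_load)[:, None] * states

# Sticky transitions: stay in the same context with probability 0.95.
transition = np.full((3, 3), 0.025) + 0.925 * np.eye(3)

history = filter_contexts(models, states, torques, transition)
print(history[19].argmax(), history[-1].argmax())
```

In this sketch the belief concentrates on the first context before the switch and on the third context after it, and the most probable context at each step could be used to select which dynamics model drives the controller. Because the off-diagonal transition probabilities are nonzero, no context's probability collapses permanently to zero, which is what lets the belief recover after a switch.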