Multi-task learning with Gaussian processes
Chai, Kian Ming
Multi-task learning refers to learning multiple tasks simultaneously, in order to avoid tabula rasa learning and to share information between similar tasks during learning. We consider a multi-task Gaussian process regression model that learns related functions by inducing correlations between tasks directly. Using this model as a reference for three other multi-task models, we provide a broad unifying view of multi-task learning. This is possible because, unlike the other models, the multi-task Gaussian process model encodes task relatedness explicitly. Each multi-task learning model generally assumes that learning multiple tasks together is beneficial. We analyze how and the extent to which multi-task learning helps improve the generalization of supervised learning. Our analysis is conducted for the average-case on the multi-task Gaussian process model, and we concentrate mainly on the case of two tasks, called the primary task and the secondary task. The main parameters are the degree of relatedness ρ between the two tasks, and πS, the fraction of the total training observations from the secondary task. Among other results, we show that asymmetric multitask learning, where the secondary task is to help the learning of the primary task, can decrease a lower bound on the average generalization error by a factor of up to ρ2πS. When there are no observations for the primary task, there is also an intrinsic limit to which observations for the secondary task can help the primary task. For symmetric multi-task learning, where the two tasks are to help each other to learn, we find the learning to be characterized by the term πS(1 − πS)(1 − ρ2). As far as we are aware, our analysis contributes to an understanding of multi-task learning that is orthogonal to the existing PAC-based results on multi-task learning. For more than two tasks, we provide an understanding of the multi-task Gaussian process model through structures in the predictive means and variances given certain configurations of training observations. These results generalize existing ones in the geostatistics literature, and may have practical applications in that domain. We evaluate the multi-task Gaussian process model on the inverse dynamics problem for a robot manipulator. The inverse dynamics problem is to compute the torques needed at the joints to drive the manipulator along a given trajectory, and there are advantages to learning this function for adaptive control. A robot manipulator will often need to be controlled while holding different loads in its end effector, giving rise to a multi-context or multi-load learning problem, and we treat predicting the inverse dynamics for a context/load as a task. We view the learning of the inverse dynamics as a function approximation problem and place Gaussian process priors over the space of functions. We first show that this is effective for learning the inverse dynamics for a single context. Then, by placing independent Gaussian process priors over the latent functions of the inverse dynamics, we obtain a multi-task Gaussian process prior for handling multiple loads, where the inter-context similarity depends on the underlying inertial parameters of the manipulator. Experiments demonstrate that this multi-task formulation is effective in sharing information among the various loads, and generally improves performance over either learning only on single contexts or pooling the data over all contexts. In addition to the experimental results, one of the contributions of this study is showing that the multi-task Gaussian process model follows naturally from the physics of the inverse dynamics.