Approximation methods and inference for stochastic biochemical kinetics
Schnoerr, David Benjamin
Recent experiments have shown the fundamental role that random fluctuations play in many chemical systems in living cells, such as gene regulatory networks. Mathematical models are thus indispensable to describe such systems and to extract relevant biological information from experimental data. Recent decades have seen a considerable amount of modelling effort devoted to this task. However, current methodologies still present outstanding mathematical and computational hurdles. In particular, models which retain the discrete nature of particle numbers necessarily incur severe computational overheads, greatly complicating the tasks of statistically characterising the noise in cells and inferring parameters from data. In this thesis we study analytical approximations and inference methods for stochastic reaction dynamics. The chemical master equation is the accepted description of stochastic chemical reaction networks whenever spatial effects can be ignored. Unfortunately, for most systems no analytic solutions are known and stochastic simulations are computationally expensive, making analytic approximations appealing alternatives. In the case where spatial effects cannot be ignored, such systems are typically modelled by means of stochastic reaction-diffusion processes. As in the non-spatial case, an analytic treatment is rarely possible and simulations quickly become infeasible. In particular, the calibration of models to data constitutes a fundamental unsolved problem. In the first part of this thesis we study two approximation methods for the chemical master equation: the chemical Langevin equation and moment closure approximations. The chemical Langevin equation approximates the discrete-valued process described by the chemical master equation by a continuous diffusion process. Despite being frequently used in the literature, it remains unclear how the boundary conditions behave under this transition from discrete to continuous variables.
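For orientation, the two equations can be written in a common textbook notation. The symbols below are assumptions introduced here for illustration and are not taken from the thesis: R reactions, a stoichiometric matrix S with columns S_j, propensity functions a_j, and independent Wiener processes W_j.

```latex
% Chemical master equation for the probability P(n, t) of the copy-number vector n
\frac{\partial P(\mathbf{n},t)}{\partial t}
  = \sum_{j=1}^{R} \Big[ a_j(\mathbf{n}-\mathbf{S}_j)\,P(\mathbf{n}-\mathbf{S}_j,t)
                        - a_j(\mathbf{n})\,P(\mathbf{n},t) \Big]

% Chemical Langevin equation for the continuous state x (Ito form)
\mathrm{d}x_i = \sum_{j=1}^{R} S_{ij}\,a_j(\mathbf{x})\,\mathrm{d}t
              + \sum_{j=1}^{R} S_{ij}\,\sqrt{a_j(\mathbf{x})}\,\mathrm{d}W_j
```

The noise terms contain the square roots of the propensities, which are real only while every a_j(x) is non-negative. This is where the behaviour at the boundary of the state space becomes relevant.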
We show that this boundary problem results in the chemical Langevin equation being mathematically ill-defined if formulated in real space, due to the occurrence of square roots of negative expressions. We show that this problem can be avoided by extending the state space from real to complex variables. We prove that this approach gives rise to real-valued moments and thus admits a probabilistic interpretation. Numerical examples demonstrate better accuracy of the developed complex chemical Langevin equation than various real-valued implementations proposed in the literature. Moment closure approximations aim at directly approximating the moments of a process, rather than its distribution. The chemical master equation gives rise to an infinite system of ordinary differential equations for the moments of a process. Moment closure approximations close this infinite hierarchy of equations by expressing moments above a certain order in terms of lower order moments. This is an ad hoc approximation without any systematic justification, and the question arises whether the resulting equations always lead to physically meaningful results. We find that this is indeed not always the case. Rather, moment closure approximations may give rise to diverging time trajectories or otherwise unphysical behaviour, such as negative mean values or unphysical oscillations. They thus fail to admit a probabilistic interpretation in these cases, and care is needed when using them to avoid drawing wrong conclusions. In the second part of this work we consider systems where spatial effects have to be taken into account. In general, such stochastic reaction-diffusion processes are only defined in an algorithmic sense without any analytic description, and it is hence not even conceptually clear how to define likelihoods of experimental data for such processes. Calibration of such models to experimental data thus constitutes a highly non-trivial task.
We derive here a novel inference method by establishing a basic relationship between stochastic reaction-diffusion processes and spatio-temporal Cox processes, two classes of models that have so far been considered distinct from each other. This novel connection naturally allows approximate likelihoods to be computed and thus inference tasks to be performed for stochastic reaction-diffusion processes. The accuracy and efficiency of this approach are demonstrated by means of several examples. Overall, this thesis advances the state of the art of modelling methods for stochastic reaction systems. It advances the understanding of several existing methods by elucidating fundamental limitations of these methods, and several novel approximation and inference methods are developed.
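As a small illustration of the boundary problem of the real-valued chemical Langevin equation described above, the following sketch integrates the equation for a birth-death system and counts how often a path leaves the non-negative domain, at which point the square root in the noise term would no longer be real. Everything in it is an assumption made for illustration and not taken from the thesis: the reaction system (production at rate k1, degradation at rate k2*x), the rate constants, the Euler-Maruyama scheme, and the function name `fraction_leaving_domain`.

```python
import numpy as np


def fraction_leaving_domain(k1=1.0, k2=1.0, x0=1.0, dt=0.01,
                            n_steps=1000, n_paths=200, seed=0):
    """Fraction of Euler-Maruyama paths of the real-valued CLE that reach x < 0.

    Hypothetical birth-death system: 0 -> X with propensity k1,
    X -> 0 with propensity k2 * x.
    """
    rng = np.random.default_rng(seed)
    n_left = 0
    for _ in range(n_paths):
        # One Wiener increment per reaction channel and time step.
        dw = rng.normal(0.0, np.sqrt(dt), size=(n_steps, 2))
        x = x0
        for step in range(n_steps):
            if x < 0.0:
                # The degradation propensity k2 * x is now negative, so its
                # square root is not real: stop and count this path.
                n_left += 1
                break
            drift = (k1 - k2 * x) * dt
            noise = np.sqrt(k1) * dw[step, 0] - np.sqrt(k2 * x) * dw[step, 1]
            x += drift + noise
    return n_left / n_paths


if __name__ == "__main__":
    print(f"fraction of paths reaching x < 0: {fraction_leaving_domain():.2f}")
```

With a mean copy number of order one, as chosen here, paths are expected to reach negative values regularly; the exact fraction depends on the seed, the step size and the rate constants. The sketch only exposes the problem and does not implement the complex-valued extension developed in the thesis.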