Variational inference for Gaussian-jump processes with application in gene regulation
In the last decades, the explosion of data from quantitative techniques has revolutionised our understanding of biological processes. In this scenario, advanced statistical methods and algorithms are becoming fundamental to decipher the dynamics of biochemical mechanisms such those involved in the regulation of gene expression. Here we develop mechanistic models and approximate inference techniques to reverse engineer the dynamics of gene regulation, from mRNA and/or protein time series data. We start from an existent variational framework for statistical inference in transcriptional networks. The framework is based on a continuous-time description of the mRNA dynamics in terms of stochastic differential equations, which are governed by latent switching variables representing the on/off activity of regulating transcription factors. The main contributions of this work are the following. We speeded-up the variational inference algorithm by developing a method to compute a posterior approximate distribution over the latent variables using a constrained optimisation algorithm. In addition to computational benefits, this method enabled the extension to statistical inference in networks with a combinatorial model of regulation. A limitation of this framework is the fact that inference is possible only in transcriptional networks with a single-layer architecture (where a single or couples of transcription factors regulate directly an arbitrary number of target genes). The second main contribution in this work is the extension of the inference framework to hierarchical structures, such as feed-forward loop. In the last contribution we define a general structure for transcription-translation networks. This work is important since it provides a general statistical framework to model complex dynamics in gene regulatory networks. The framework is modular and scalable to realistically large systems with general architecture, thus representing a valuable alternative to traditional differential equation models. All models are embedded in a Bayesian framework; inference is performed using a variational approach and compared to exact inference where possible. We apply the models to the study of different biological systems, from the metabolism in E. coli to the circadian clock in the picoalga O. tauri.