Computational methods for RNA integrative biology
Abstract
Ribonucleic acid (RNA) is an essential molecule that carries out a wide variety
of functions within the cell, from its crucial involvement in protein synthesis to
catalysing biochemical reactions and regulating gene expression. This diverse functional
repertoire stems from the complex structures that RNA can adopt and from its flexibility
as an interacting molecule.
Technological advances such as next-generation sequencing (NGS) have made it possible
to measure these two crucial aspects of RNA's regulatory role experimentally.
NGS methods can rapidly obtain the nucleotide sequence of many molecules
in parallel. Designing experiments in which only the desired parts of a molecule (or
specific parts of the transcriptome) are sequenced makes it possible to study various
aspects of RNA biology. Analysing NGS data is infeasible without computational
methods.
One such experimental method is RNA structure probing, which aims to infer RNA
structure from sequencing chemically altered transcripts. RNA structure probing data
is inherently noisy, affected both by technological biases and the stochasticity of the
underlying process. Most existing methods do not adequately address the issue of
noise, resorting to heuristics and limiting the informativeness of their output. In this
thesis, a statistical pipeline was developed for modelling RNA structure probing data,
which explicitly captures biological variability, provides automated bias-correcting
strategies, and generates a probabilistic output based on experimental measurements.
The output of our method agrees with known RNA structures, can be used to constrain
structure prediction algorithms, and remains robust to reduced sequence coverage,
thereby increasing the sensitivity of the technology.
Another recent experimental innovation maps RNA-protein interactions at very
high temporal resolution, making it possible to study rapid binding events that occur
on a time scale of minutes. In this thesis, a non-parametric algorithm was developed for
identifying significant changes in RNA-protein binding time-series between different
conditions. The method was applied to novel yeast RNA-protein binding time-course
data to study the role of RNA degradation in the stress response. It revealed pervasive
changes in the transcriptome binding of the yeast transcription termination
factor Nab3 and the cytoplasmic exoribonuclease Xrn1 under nutrient stress. This
challenged the common assumption that transcriptional changes are the major
driver of changes in RNA expression during stress and highlighted the importance of
degradation. These findings inspired a dynamical model of RNA expression, in which
transcription and degradation rates are modelled using RNA-protein binding time-series
data.
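
The idea of a dynamical model with time-varying transcription and degradation rates can be illustrated with a minimal sketch. The abstract does not give the thesis's actual equations, so the code below assumes generic first-order kinetics, dm/dt = synthesis(t) - degradation(t) * m(t). The function names, the piecewise-linear interpolation of the rates, and the forward-Euler integrator are all hypothetical choices made for illustration. In the thesis the rates are informed by RNA-protein binding time-series; here they are simply supplied as lists of numbers.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Piecewise-linear interpolation of a sampled rate, clamped at both ends."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t) - 1
    frac = (t - times[i]) / (times[i + 1] - times[i])
    return values[i] + frac * (values[i + 1] - values[i])


def simulate_expression(times, synthesis, degradation, m0, dt=0.01):
    """Forward-Euler integration of dm/dt = synthesis(t) - degradation(t) * m.

    times       -- increasing sample times of the rate time-series
    synthesis   -- assumed transcription rate at each sample time
    degradation -- assumed per-molecule degradation rate at each sample time
    m0          -- RNA abundance at times[0]
    Returns a list of (time, abundance) pairs.
    """
    t, m = times[0], m0
    trajectory = [(t, m)]
    while t < times[-1] - 1e-12:
        step = min(dt, times[-1] - t)  # do not overshoot the final time
        s = interpolate(times, synthesis, t)
        d = interpolate(times, degradation, t)
        m += step * (s - d * m)
        t += step
        trajectory.append((t, m))
    return trajectory


if __name__ == "__main__":
    # Hypothetical example: constant transcription, degradation rising under stress.
    times = [0.0, 10.0]
    baseline = simulate_expression(times, [2.0, 2.0], [0.5, 0.5], m0=4.0)
    stressed = simulate_expression(times, [2.0, 2.0], [0.5, 2.0], m0=4.0)
    print("final abundance, constant degradation:", round(baseline[-1][1], 3))
    print("final abundance, rising degradation:", round(stressed[-1][1], 3))
```

With constant rates the abundance relaxes to synthesis / degradation. In this toy setting, a rising degradation rate lowers the abundance even though transcription is unchanged, which is the kind of effect the abstract attributes to degradation.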