World of probabilities: a molecular dynamics and Markov state modelling approach for rational design of allosteric modulators
Authors
Hardie, Adele
Abstract
Even with current scientific and technological advances, drug discovery is a lengthy
and expensive process. With a large number of pharmaceuticals already on the
market and increasingly strict regulations, it is difficult to design compounds that
are either a significant improvement on existing drugs or aimed at a novel target.
In the light of this, allosteric modulators are a source of novelty in the field of drug
discovery. Allosteric sites, i.e. sites that are distinct from the active site, tend to
have high variety and low conservation even between proteins of the same family.
Designing allosteric, rather than orthosteric, modulators allows for improved drug
profiles, new ways of drugging already targeted proteins, and even revisiting targets
previously deemed undruggable.
Aided by progress in structural biology and in available computing power, computer-aided
drug design methods are heavily utilized in the study of allosteric modulation.
There are multiple allosteric pocket detection and residue network analysis tools
available to the computational chemist; however, the effect a ligand binding to an
allosteric site might have on the protein conformational ensemble remains difficult
to quantify. Approaches using machine learning and Markov modelling have been in
development, but they require molecular dynamics (MD) simulations
that are currently too time consuming for practical applications.
This thesis presents the development and application of a joint steered MD
(sMD) and Markov State Modelling (MSM) approach that reduces the computational
time required to sample the relevant conformational space of the protein. In this workflow,
sMD simulations are used to bias the protein system from functionally “active”
to “inactive” conformations, and vice versa. From the sMD trajectories, a range of
protein conformations is sampled, including unstable intermediate conformations
not routinely accessible via standard MD methods. Each of these conformations
serves as a new starting point for a swarm of unbiased MD simulations, allowing
this methodology to leverage the increasingly available parallel computing infrastructures.
These “seeded” MD simulations are combined to build MSMs, which
describe the protein conformational ensemble. The MSMs are modelled in parallel,
so that the state probabilities are directly comparable across MSMs.
The state probabilities of a protein system with no potential allosteric modulators
are used as a baseline, and ligands are characterized based on the changes they
induce. If the presence of a ligand decreases the probability of a state defined as
“active”, the ligand is classed as an allosteric inhibitor. Conversely, if the
ligand increases the probability of this state, it is an activator.
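
The comparison of state probabilities described above can be illustrated with a minimal sketch. It is not the thesis implementation: the two-state transition matrices are made-up toy values, the function names are hypothetical, and the 0.05 threshold for calling a shift meaningful is an assumption chosen only for illustration. The sketch computes the stationary (equilibrium) state probabilities of a row-stochastic MSM transition matrix by power iteration and labels a ligand by the change it induces in the "active" state probability relative to the ligand-free baseline.

```python
def stationary_distribution(T, n_iter=10000, tol=1e-12):
    """Stationary distribution of a row-stochastic matrix T by power iteration."""
    n = len(T)
    pi = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(n_iter):
        # One step of pi <- pi T
        new = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def classify_ligand(p_active_baseline, p_active_ligand, threshold=0.05):
    """Label a ligand by its shift in active-state probability.

    The threshold is an illustrative assumption, not a value from the thesis.
    """
    delta = p_active_ligand - p_active_baseline
    if delta > threshold:
        return "activator"
    if delta < -threshold:
        return "inhibitor"
    return "no clear effect"


# Toy two-state models; state 0 = "inactive", state 1 = "active" (made-up numbers).
T_apo = [[0.90, 0.10],
         [0.10, 0.90]]
T_lig = [[0.95, 0.05],
         [0.15, 0.85]]

p_apo = stationary_distribution(T_apo)[1]
p_lig = stationary_distribution(T_lig)[1]
print(p_apo, p_lig, classify_ligand(p_apo, p_lig))
```

In practice the transition matrices would be estimated from the seeded MD trajectories with a dedicated MSM package, and the uncertainty of the state probabilities would need to be considered before assigning a label.
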
The main body of this thesis consists of application of the above methodology
to three protein systems: Protein Tyrosine Phosphatase 1B (PTP1B), Exchange
Protein directly Activated by cAMP 1 (EPAC1), and Polycystic Kidney Disease 2
(PKD2). Each system highlights a different class of drug target and activation mechanism.
Additionally, each chapter emphasizes various considerations and caveats of
applying sMD/MSMs to allosteric modulator assessment. Firstly, the workflow is
validated for the first time on known inhibitors of PTP1B. The inhibitors target two
distinct allosteric sites, and the trends in the experimentally measured inhibition are
captured by the MSM modelled state probabilities. Additionally, the importance of
comprehensively describing the protein conformational changes during sMD is discussed.
The different effects of the ligands on PTP1B activity are also related to the
different protein-ligand interactions observed in molecular dynamics simulations.
Secondly, the approach is applied to EPAC1, this time modelling activation by
cAMP and partial activation by compound I942. Furthermore, while the function
of PTP1B was defined by small loop motions, the activation of EPACs involves a
large domain rearrangement and a two-step mechanism. A three-state conformational
ensemble model is discussed for EPAC1, capturing activation by cAMP and
partial activation by I942. As the description of protein dynamics in three states
is more complex, data-driven metastable state partitioning is less reliable.
State assignment was therefore done manually, based on knowledge of EPAC activation,
highlighting the non-triviality of biologically relevant state assignment. To investigate
the differences between cAMP and I942, the latter is modelled with a variety
of restraints that mimic protein-ligand interactions observed with cAMP. This allows
MD-guided suggestions to be made for developing the I942 lead into a full
activator.
Finally, the above sMD/MSM methodology is applied to PKD2, illustrating a
more complex scenario where less data is available. Several considerations were
taken into account when modelling PKD2, such as simulating a membrane and truncating the
protein. As no small molecule modulators of PKD2 are known, the goal of this
chapter is to investigate the regulatory effect of the membrane lipid PI(4,5)P2 on PKD2.
As a control, the activation by a gain-of-function mutant was modelled first, followed
by inhibition by PI(4,5)P2. This example gives insight into the future scalability of the
sMD/MSM workflow presented in this thesis, and into considerations for its application to
real-life drug design projects.