dc.contributor.advisor | Michel, Julien | |
dc.contributor.advisor | Mey, Antonia | |
dc.contributor.advisor | Kirrander, Adam | |
dc.contributor.author | Scheen, Jenke | |
dc.date.accessioned | 2023-05-02T10:52:17Z | |
dc.date.available | 2023-05-02T10:52:17Z | |
dc.date.issued | 2023-05-02 | |
dc.identifier.uri | https://hdl.handle.net/1842/40545 | |
dc.identifier.uri | http://dx.doi.org/10.7488/era/3311 | |
dc.description.abstract | The work presented in this thesis resides at the interface of alchemical free energy
methods (AFE) and machine-learning (ML) in the context of computer-aided drug
discovery (CADD). The majority of the work consists of explorations into regions
of synergy between the individual parts. The overarching hypothesis behind this
work is that although areas of high potential exist for standalone ML and AFE in
CADD, an additional source of value can be found in areas where ML and AFE are
combined in such a way that the new methodology profits from key strengths in
either part.
Physics-based AFE calculations have, over several decades, grown into precise
and accurate (sub-kcal·mol⁻¹ in terms of mean absolute error versus experimental
measures) methods of predicting ligand-protein binding affinities, which is the main
driver of their popularity in project support in drug design workflows. Data-driven
ML methods have seen a similar rapid development spurred by the exponential
growth in computational hardware capabilities, but are generally still lacking in
accuracy versus experimental measures of binding affinities to support drug design
work. The two approaches contrast, however: the former relies mainly on physical rules in the form
of statistical mechanics and the latter profits from interpolating signals within large
training domains of data.
After a historical and theoretical introduction to drug discovery, AFE calculations
and ML methods, the thesis will highlight several studies that reflect the above hypothesis at multiple key points in the AFE workflow. Firstly, a methodology that combines AFE with ML has been developed to compute accurate absolute hydration free energies. The hybrid AFE/ML methodology
was trained on a subset of the FreeSolv database, and retrospectively shown to
outperform most submissions from the SAMPL4 competition. Compared to pure
machine-learning approaches, AFE/ML yields more precise estimates of free energies
of hydration, and requires a fraction of the training set size to outperform standalone
AFE calculations. The ML-derived correction terms are further shown to be transferable to a range of related AFE simulation protocols. The approach may be used
to inexpensively improve the accuracy of AFE calculations, and to flag molecules
which will benefit the most from bespoke force field parameterisation efforts.
Secondly, early investigations into data-driven AFE network generators have been
performed. Because AFE calculations make use of alchemical transformations between ligands in congeneric series, practitioners are required to estimate an optimal
combination of transformations for each series. AFE networks constitute the collection of edges chosen such that all ligands (nodes) are included in the network and
where each edge is an AFE calculation. As there is a vast number of possible configurations for such networks, this step in AFE setup suffers from several shortcomings
such as limited scalability and limited transferability between AFE software packages.
Although AFE network generation has been automated in the past, the algorithm
depends mostly on expert-driven estimation of AFE transformation reliabilities.
This work presents a first iteration of a data-driven alternative to the state-of-the-art using a graph Siamese neural network architecture. A novel dataset, RBFE-Space, is presented as a representative and transferable training domain for AFE
ML research. The workflow presented in this thesis matches state-of-the-art AFE
network generation performance with several key benefits. The workflow provides
full transferability of the network generator because RBFE-Space is open-sourced
and ready to be applied to other AFE software packages. Additionally, the deep learning
model represents the first robust ML predictor of transformation reliabilities in AFE
calculations.
Finally, one major shortcoming of AFE calculations is their decreased reliability for
transformations that are larger than ∼5 heavy atoms. The work reported in this
thesis describes investigations into whether running charge, van der Waals and bond
parameter transformations individually (with variable λ allocation per step) offers an
advantage over transforming all parameters in a single step, as is the current standard
in most AFE workflows. Initial results in this work qualitatively suggest that the
bound leg benefits from a MultiStep protocol over a one-step ("SoftCore") protocol,
whereas the free leg does not show benefit. Further work performed by Cresset
showed no observable benefit of the MultiStep approach over the SoftCore approach. Several key findings are reported in this work that illustrate the benefits of
dissecting an FEP approach and comparing the two approaches side-by-side. | en |
dc.language.iso | en | en |
dc.publisher | The University of Edinburgh | en |
dc.relation.hasversion | Best Practices for Alchemical Free Energy Calculations A. S. J. S. Mey, B. K. Allen, H. E. Bruce McDonald, J. D. Chodera, D. F. Hahn, M. Kuhn, J. Michel, D. L. Mobley, L. N. Naden, S. Prasad, A. Rizzi, J. Scheen, M. R. Shirts, G. Tresadern and H. Xu Living Journal of Computational Molecular Science, 2020, 2, 18378 | en |
dc.relation.hasversion | Hybrid Alchemical Free Energy/Machine-Learning Methodology for the Computation of Hydration Free Energies J. Scheen, W. Wu, A. S. J. S. Mey, P. Tosco, M. Mackey and J. Michel Journal of Chemical Information and Modeling, 2020, 60, 5331–5339 | en |
dc.relation.hasversion | Data-driven Generation of Perturbation Networks for Relative Binding Free Energy Calculations J. Scheen, M. Mackey and J. Michel Digital Discovery, 2022, 1, 870-885 | en |
dc.subject | computational chemistry | en |
dc.subject | computer-aided drug design | en |
dc.subject | CADD | en |
dc.subject | AFE science | en |
dc.title | Applications of artificial intelligence to alchemical free energy calculations in contemporary drug design | en |
dc.type | Thesis or Dissertation | en |
dc.type.qualificationlevel | Doctoral | en |
dc.type.qualificationname | PhD Doctor of Philosophy | en |