1
Wan S, Coveney PV. Introduction to Computational Biomedicine. Methods Mol Biol 2024; 2716:1-13. [PMID: 37702933 DOI: 10.1007/978-1-0716-3449-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]
Abstract
The domain of computational biomedicine is a new and burgeoning one. Its areas of concern cover all scales of human biology, physiology, and pathology, commonly referred to as medicine, from the genomic to the whole human and beyond, including epidemiology and population health. Computational biomedicine aims to provide high-fidelity descriptions and predictions of the behavior of biomedical systems of both fundamental scientific and clinical importance. Digital twins and virtual humans aim to reproduce the extremely accurate duplicate of real-world human beings in cyberspace, which can be used to make highly accurate predictions that take complicated conditions into account. When that can be done reliably enough for the predictions to be actionable, such an approach will make an impact in the pharmaceutical industry by reducing or even replacing the extremely laboratory-intensive preclinical process of making and testing compounds in laboratories, and in clinical applications by assisting clinicians to make diagnostic and treatment decisions.
Affiliation(s)
- Shunzhou Wan
- Department of Chemistry, Centre for Computational Science, University College London, London, UK
- Peter V Coveney
- Department of Chemistry, Centre for Computational Science, University College London, London, UK.
- Advanced Research Computing Centre, University College London, London, UK.
- Computational Science Laboratory, Institute for Informatics, Faculty of Science, University of Amsterdam, Amsterdam, the Netherlands.
2
Wan S, Bhati AP, Coveney PV. Comparison of Equilibrium and Nonequilibrium Approaches for Relative Binding Free Energy Predictions. J Chem Theory Comput 2023; 19:7846-7860. [PMID: 37862058 PMCID: PMC10653111 DOI: 10.1021/acs.jctc.3c00842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Indexed: 10/21/2023]
Abstract
Alchemical relative binding free energy calculations have recently found important applications in drug optimization. A series of congeneric compounds are generated from a preidentified lead compound, and their relative binding affinities to a protein are assessed in order to optimize candidate drugs. While methods based on equilibrium thermodynamics have been extensively studied, an approach based on nonequilibrium methods has recently been reported together with claims of its superiority. However, these claims pay insufficient attention to the basis and reliability of both methods. Here we report a comparative study of the two approaches across a large data set, comprising more than 500 ligand transformations spanning in excess of 300 ligands binding to a set of 14 diverse protein targets. Ensemble methods are essential to quantify the uncertainty in these calculations, not only for the reasons already established in the equilibrium approach but also to ensure that the nonequilibrium calculations reside within their domain of validity. If and only if ensemble methods are applied, we find that the nonequilibrium method can achieve accuracy and precision comparable to those of the equilibrium approach. Compared to the equilibrium method, the nonequilibrium approach can reduce computational costs but introduces higher computational complexity and longer wall clock times. There are, however, cases where the standard length of a nonequilibrium transition is not sufficient, necessitating a complete rerun of the entire set of transitions. This significantly increases the computational cost and proves to be highly inconvenient during large-scale applications. Our findings provide a key set of recommendations that should be adopted for the reliable implementation of nonequilibrium approaches to relative binding free energy calculations in ligand-protein systems.
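To make the nonequilibrium idea concrete, here is a minimal sketch of one textbook nonequilibrium estimator, the Jarzynski equality, ΔG = -kT ln⟨exp(-W/kT)⟩, applied to the work values from an ensemble of transitions. It is an illustration only: the abstract does not state which estimator the paper uses, and the function names, the temperature, and the work values are assumptions introduced here.

```python
import math

KT = 0.596  # k_B*T in kcal/mol at roughly 300 K (assumed temperature)

def jarzynski_free_energy(work_values, kT=KT):
    """Estimate dG = -kT ln <exp(-W/kT)> with a log-sum-exp for stability."""
    scaled = [-w / kT for w in work_values]
    m = max(scaled)
    log_mean = m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled))
    return -kT * log_mean

def relative_binding_free_energy(dg_complex, dg_solvent):
    """Thermodynamic cycle: ddG = dG(ligand A->B in complex) - dG(A->B in solvent)."""
    return dg_complex - dg_solvent

# Hypothetical work values (kcal/mol) from short nonequilibrium transitions.
work_complex = [1.9, 2.4, 2.1, 3.0, 2.2]
work_solvent = [1.2, 1.5, 1.1, 1.8, 1.3]
ddg = relative_binding_free_energy(
    jarzynski_free_energy(work_complex),
    jarzynski_free_energy(work_solvent),
)
print(f"ddG estimate: {ddg:.2f} kcal/mol")
```

Because the exponential average is dominated by rare low-work transitions, an estimate from a single short set of transitions can be unreliable, which is consistent with the abstract's emphasis on ensembles.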
Affiliation(s)
- Shunzhou Wan
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, U.K.
- Agastya P. Bhati
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, U.K.
- Peter V. Coveney
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, U.K.
- Advanced Research Computing Centre, University College London, London WC1H 0AJ, U.K.
- Computational Science Laboratory, Institute for Informatics, Faculty of Science, University of Amsterdam, Amsterdam 1012 WP, Netherlands
3
Wan S, Sinclair RC, Coveney PV. Uncertainty quantification in classical molecular dynamics. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 2021; 379:20200082. [PMID: 33775140 PMCID: PMC8059622 DOI: 10.1098/rsta.2020.0082] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Accepted: 11/02/2020] [Indexed: 05/24/2023]
Abstract
Molecular dynamics simulation is now a widespread approach for understanding complex systems on the atomistic scale. It finds applications from physics and chemistry to engineering, life and medical science. In the last decade, the approach has begun to advance from being a computer-based means of rationalizing experimental observations to producing apparently credible predictions for a number of real-world applications within industrial sectors such as advanced materials and drug discovery. However, key aspects concerning the reproducibility of the method have not kept pace with the speed of its uptake in the scientific community. Here, we present a discussion of uncertainty quantification for molecular dynamics simulation designed to endow the method with better error estimates that will enable it to be used to report actionable results. The approach adopted is a standard one in the field of uncertainty quantification, namely using ensemble methods, in which a sufficiently large number of replicas are run concurrently, from which reliable statistics can be extracted. Indeed, because molecular dynamics is intrinsically chaotic, the need to use ensemble methods is fundamental and holds regardless of the duration of the simulations performed. We discuss the approach and illustrate it in a range of applications from materials science to ligand-protein binding free energy estimation. This article is part of the theme issue 'Reliability and reproducibility in computational science: implementing verification, validation and uncertainty quantification in silico'.
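The ensemble approach described in this abstract can be sketched as follows: run several independent replicas, take one value per replica, and report the ensemble mean together with a bootstrap standard error. This is a minimal illustration, not the authors' workflow; the function name, the number of bootstrap resamples, and the replica values are assumptions made for the example.

```python
import random
import statistics

def ensemble_estimate(replica_values, n_boot=2000, seed=0):
    """Return (mean, bootstrap standard error) over independent replicas."""
    rng = random.Random(seed)
    n = len(replica_values)
    mean = statistics.fmean(replica_values)
    boot_means = []
    for _ in range(n_boot):
        # Resample the replicas with replacement and record the resampled mean.
        resample = [replica_values[rng.randrange(n)] for _ in range(n)]
        boot_means.append(statistics.fmean(resample))
    return mean, statistics.stdev(boot_means)

# Hypothetical per-replica results (e.g. a binding free energy in kcal/mol).
replicas = [-9.8, -10.4, -9.1, -10.9, -9.6]
mean, err = ensemble_estimate(replicas)
print(f"{mean:.2f} +/- {err:.2f} kcal/mol")
```

Resampling whole replicas, rather than frames within one trajectory, treats each replica as one independent sample, which matches the abstract's point that the need for ensembles holds regardless of simulation length.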
Affiliation(s)
- Shunzhou Wan
- Centre for Computational Science, University College London, Gordon Street, London WC1H 0AJ, UK
- Robert C. Sinclair
- Centre for Computational Science, University College London, Gordon Street, London WC1H 0AJ, UK
- Peter V. Coveney
- Centre for Computational Science, University College London, Gordon Street, London WC1H 0AJ, UK
- Institute for Informatics, Science Park 904, University of Amsterdam, 1098 XH Amsterdam, The Netherlands
4
Wan S, Bhati AP, Zasada SJ, Coveney PV. Rapid, accurate, precise and reproducible ligand-protein binding free energy prediction. Interface Focus 2020; 10:20200007. [PMID: 33178418 PMCID: PMC7653346 DOI: 10.1098/rsfs.2020.0007] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Accepted: 09/11/2020] [Indexed: 02/06/2023]
Abstract
A central quantity of interest in molecular biology and medicine is the free energy of binding of a molecule to a target biomacromolecule. Until recently, the accurate prediction of binding affinity had been widely regarded as out of reach of theoretical methods owing to the lack of reproducibility of the available methods, not to mention their complexity, computational cost and time-consuming procedures. The lack of reproducibility stems primarily from the chaotic nature of classical molecular dynamics (MD) and the associated extreme sensitivity of trajectories to their initial conditions. Here, we review computational approaches for both relative and absolute binding free energy calculations, and illustrate their application to a diverse set of ligands bound to a range of proteins with immediate relevance in a number of medical domains. We focus on ensemble-based methods which are essential in order to compute statistically robust results, including two we have recently developed, namely thermodynamic integration with enhanced sampling and enhanced sampling of MD with an approximation of continuum solvent. Together, these form a set of rapid, accurate, precise and reproducible free energy methods. They can be used in real-world problems such as hit-to-lead and lead optimization stages in drug discovery, and in personalized medicine. These applications show that individual binding affinities equipped with uncertainty quantification may be computed in a few hours on a massive scale given access to suitable high-end computing resources and workflow automation. A high level of accuracy can be achieved using these approaches.
Affiliation(s)
- Shunzhou Wan
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, UK
- Agastya P. Bhati
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, UK
- Stefan J. Zasada
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, UK
- Peter V. Coveney
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, UK
- Computational Science Laboratory, Institute for Informatics, Faculty of Science, University of Amsterdam, 1098XH Amsterdam, The Netherlands
5
Bhati A, Wan S, Coveney PV. Ensemble-Based Replica Exchange Alchemical Free Energy Methods: The Effect of Protein Mutations on Inhibitor Binding. J Chem Theory Comput 2019; 15:1265-1277. [PMID: 30592603 PMCID: PMC6447239 DOI: 10.1021/acs.jctc.8b01118] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 11/02/2018] [Indexed: 01/06/2023]
Abstract
The accurate prediction of the binding affinity changes of drugs caused by protein mutations is a major goal in clinical personalized medicine. We have developed an ensemble-based free energy approach called thermodynamic integration with enhanced sampling (TIES), which yields accurate, precise, and reproducible binding affinities. TIES has been shown to perform well for predictions of free energy differences of congeneric ligands to a wide range of target proteins. We have recently introduced variants of TIES, which incorporate the enhanced sampling technique REST2 (replica exchange with solute tempering) and the free energy estimator MBAR (multistate Bennett acceptance ratio). Here we further extend the TIES methodology to study relative binding affinities caused by protein mutations when bound to a ligand, a variant which we call TIES-PM. We apply TIES-PM to fibroblast growth factor receptor 3 (FGFR3) to investigate binding free energy changes upon protein mutations. The results show that TIES-PM with REST2 successfully captures a large conformational change and generates correct free energy differences caused by a gatekeeper mutation located in the binding pocket. Simulations without REST2 fail to overcome the energy barrier between the conformations, and hence the results are highly sensitive to the initial structures. We also discuss situations where REST2 does not improve the accuracy of predictions.
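As a rough illustration of thermodynamic integration with ensemble averaging, the sketch below averages ⟨dU/dλ⟩ over replicas at each λ window and integrates the averaged curve with the trapezoidal rule, ΔG = ∫₀¹ ⟨dU/dλ⟩ dλ. It is a simplified sketch, not the TIES implementation: the function name, the λ grid, and the input values are hypothetical, and REST2, MBAR, and error propagation are omitted.

```python
import statistics

def ti_free_energy(lambdas, dudl_by_replica):
    """Trapezoidal TI estimate from replica-averaged <dU/dlambda>.

    dudl_by_replica[r][i] is the time-averaged dU/dlambda from replica r
    at window lambdas[i].
    """
    mean_dudl = [
        statistics.fmean([rep[i] for rep in dudl_by_replica])
        for i in range(len(lambdas))
    ]
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        total += 0.5 * (mean_dudl[i] + mean_dudl[i + 1]) * width
    return total

# Hypothetical data: two replicas scattered around dU/dlambda = 2 * lambda.
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
dudl_by_replica = [
    [2 * lam + 0.1 for lam in lambdas],
    [2 * lam - 0.1 for lam in lambdas],
]
dg = ti_free_energy(lambdas, dudl_by_replica)
print(f"dG estimate: {dg:.3f}")
```

Averaging over replicas before integrating keeps the estimate from depending on any single trajectory, which is the sensitivity to initial structures that the abstract highlights.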
Affiliation(s)
- Agastya P. Bhati
- Centre for Computational Science, Department of Chemistry, University College London, 20 Gordon Street, London, WC1H 0AJ, United Kingdom
- Shunzhou Wan
- Centre for Computational Science, Department of Chemistry, University College London, 20 Gordon Street, London, WC1H 0AJ, United Kingdom
- Peter V. Coveney
- Centre for Computational Science, Department of Chemistry, University College London, 20 Gordon Street, London, WC1H 0AJ, United Kingdom
6
Eccleston RC, Wan S, Dalchau N, Coveney PV. The Role of Multiscale Protein Dynamics in Antigen Presentation and T Lymphocyte Recognition. Front Immunol 2017; 8:797. [PMID: 28740497 PMCID: PMC5502259 DOI: 10.3389/fimmu.2017.00797] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/01/2016] [Accepted: 06/22/2017] [Indexed: 12/15/2022]
Abstract
T lymphocytes are stimulated when they recognize short peptides bound to class I proteins of the major histocompatibility complex (MHC) protein, as peptide-MHC complexes. Due to the diversity in T-cell receptor (TCR) molecules together with both the peptides and MHC proteins they bind to, it has been difficult to design vaccines and treatments based on these interactions. Machine learning has made some progress in trying to predict the immunogenicity of peptide sequences in the context of specific MHC class I alleles but, as such approaches cannot integrate temporal information and lack explanatory power, their scope will always be limited. Here, we advocate a mechanistic description of antigen presentation and TCR activation which is explanatory, predictive, and quantitative, drawing on modeling approaches that collectively span several length and time scales, being capable of furnishing reliable biological descriptions that are difficult for experimentalists to provide. It is a form of multiscale systems biology. We propose the use of chemical rate equations to describe the time evolution of the foreign and host proteins to explain how the original proteins end up being presented on the cell surface as peptide fragments, while we invoke molecular dynamics to describe the key binding processes on the molecular level, including those of peptide-MHC complexes with TCRs which lie at the heart of the immune response. On each level, complementary methods based on machine learning are available, and we discuss the relationship between these divergent approaches. The pursuit of predictive mechanistic modeling approaches requires experimentalists to adapt their work so as to acquire, store, and expose data that can be used to verify and validate such models.
Affiliation(s)
- R Charlotte Eccleston
- Centre for Computational Science, Department of Chemistry, University College London, London, United Kingdom
- Shunzhou Wan
- Centre for Computational Science, Department of Chemistry, University College London, London, United Kingdom
- Peter V Coveney
- Centre for Computational Science, Department of Chemistry, University College London, London, United Kingdom
7
Wan S, Knapp B, Wright DW, Deane CM, Coveney PV. Rapid, Precise, and Reproducible Prediction of Peptide-MHC Binding Affinities from Molecular Dynamics That Correlate Well with Experiment. J Chem Theory Comput 2015; 11:3346-56. [PMID: 26575768 DOI: 10.1021/acs.jctc.5b00179] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Indexed: 12/16/2022]
Abstract
The presentation of potentially pathogenic peptides by major histocompatibility complex (MHC) molecules is one of the most important processes in adaptive immune defense. Prediction of peptide-MHC (pMHC) binding affinities is therefore a principal objective of theoretical immunology. Machine learning techniques achieve good results if substantial experimental training data are available. Approaches based on structural information become necessary if sufficiently similar training data are unavailable for a specific MHC allele, although they have often been deemed to lack accuracy. In this study, we use a free energy method to rank the binding affinities of 12 diverse peptides bound by a class I MHC molecule HLA-A*02:01. The method is based on enhanced sampling of molecular dynamics calculations in combination with a continuum solvent approximation and includes estimates of the configurational entropy based on either a one or a three trajectory protocol. It produces precise and reproducible free energy estimates which correlate well with experimental measurements. If the results are combined with an amino acid hydrophobicity scale, then an extremely good ranking of peptide binding affinities emerges. Our approach is rapid, robust, and applicable to a wide range of ligand-receptor interactions without further adjustment.
Affiliation(s)
- Shunzhou Wan
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, United Kingdom
- Bernhard Knapp
- Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, OX1 3TG, United Kingdom
- David W Wright
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, United Kingdom
- Charlotte M Deane
- Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, OX1 3TG, United Kingdom
- Peter V Coveney
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, United Kingdom
8
Wright DW, Hall BA, Kenway OA, Jha S, Coveney PV. Computing Clinically Relevant Binding Free Energies of HIV-1 Protease Inhibitors. J Chem Theory Comput 2014; 10:1228-1241. [PMID: 24683369 PMCID: PMC3966525 DOI: 10.1021/ct4007037] [Citation(s) in RCA: 95] [Impact Index Per Article: 9.5] [Received: 08/07/2013] [Indexed: 11/28/2022]
Abstract
The use of molecular simulation to estimate the strength of macromolecular binding free energies is becoming increasingly widespread, with goals ranging from lead optimization and enrichment in drug discovery to personalizing or stratifying treatment regimes. In order to realize the potential of such approaches to predict new results, not merely to explain previous experimental findings, it is necessary that the methods used are reliable and accurate, and that their limitations are thoroughly understood. However, the computational cost of atomistic simulation techniques such as molecular dynamics (MD) has meant that until recently little work has focused on validating and verifying the available free energy methodologies, with the consequence that many of the results published in the literature are not reproducible. Here, we present a detailed analysis of two of the most popular approximate methods for calculating binding free energies from molecular simulations, molecular mechanics Poisson-Boltzmann surface area (MMPBSA) and molecular mechanics generalized Born surface area (MMGBSA), applied to the nine FDA-approved HIV-1 protease inhibitors. Our results show that the values obtained from replica simulations of the same protease-drug complex, differing only in initially assigned atom velocities, can vary by as much as 10 kcal/mol, which is greater than the difference between the best and worst binding inhibitors under investigation. Despite this, analysis of ensembles of simulations producing 50 trajectories of 4 ns duration leads to well converged free energy estimates. For seven inhibitors, we find that with correctly converged normal mode estimates of the configurational entropy, we can correctly distinguish inhibitors in agreement with experimental data for both the MMPBSA and MMGBSA methods and thus have the ability to rank the efficacy of binding of this selection of drugs to the protease (no account is made for free energy penalties associated with protein distortion, leading to the overestimation of the binding strength of the two largest inhibitors, ritonavir and atazanavir). We obtain improved rankings and estimates of the relative binding strengths of the drugs by using a novel combination of MMPBSA/MMGBSA with normal mode entropy estimates and the free energy of association calculated directly from simulation trajectories. Our work provides a thorough assessment of what is required to produce converged and hence reliable free energies for protein-ligand binding.
Affiliation(s)
- David W. Wright
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, United Kingdom
- Benjamin A. Hall
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, United Kingdom
- Owain A. Kenway
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, United Kingdom
- Shantenu Jha
- Electrical and Computer Engineering, Rutgers University, Piscataway, New Jersey 08854, United States
- Peter V. Coveney
- Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, United Kingdom
9
Shublaq N, Sansom C, Coveney PV. Patient-specific modelling in drug design, development and selection including its role in clinical decision-making. Chem Biol Drug Des 2013; 81:5-12. [PMID: 22765044 DOI: 10.1111/j.1747-0285.2012.01444.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/16/2022]
Abstract
Genomics has made enormous progress in the twelve years since the publication of the first draft human genome sequence, but it has not yet been translated into the clinic. Despite spiralling development costs, the number of new drug registrations is not increasing. One reason for this lies in the genetic complexity of disease. Most diseases involve dysregulation in pathways that involve many genes, and many (including most cancers) are themselves genetically heterogeneous. Systems biology involves the multi-level simulation of physiology, cell biology and biochemistry using complex computational techniques. We show here using case studies in cancer and HIV how such computational models, and particularly models based on individual patient data, can be used for drug design and development, and in the selection of the appropriate treatment for a given patient in the face of resistance mutations. If these techniques are to be adopted in routine clinical practice, clinicians will need better training in modern approaches to the integrated analysis of large-scale heterogeneous data and multi-scale models, while developers will need to provide much more usable tools. Investment in computational infrastructure is needed so that results can be returned on clinically relevant timescales and data warehouses designed with data protection as well as accessibility in mind.
Affiliation(s)
- Nour Shublaq
- Centre for Computational Science & Computational Life & Medical Sciences Network, University College London, London, UK
10
Wan S, Wright DW, Coveney PV. Mechanism of drug efficacy within the EGF receptor revealed by microsecond molecular dynamics simulation. Mol Cancer Ther 2012; 11:2394-400. [PMID: 22863610 DOI: 10.1158/1535-7163.mct-12-0644-t] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/16/2022]
Abstract
The EGF receptor (EGFR) regulates important cellular processes including proliferation, differentiation, and apoptosis. EGFR is frequently overexpressed in a range of cancers and is associated with disease progression and treatment. Clinical studies have shown that EGFR mutations confer tumor sensitivity to tyrosine kinase inhibitors in patients with non-small cell lung cancer. In this study, we have conducted molecular dynamics simulations over several microseconds for wild-type and L858R mutant forms of EGFR in the ligand-free state. Close inspection of the conformations and interactions within the binding pocket reveals, converse to the wild type, that the mutant EGFR prefers to bind gefitinib, a targeted anticancer drug, rather than ATP, offering an explanation for why gefitinib is more effective in patients with EGFR mutations than those without.
Affiliation(s)
- Shunzhou Wan
- Centre for Computational Science, Department of Chemistry, University College London, WC1H 0AJ, United Kingdom